Preface


There are so many excellent books on finite difference methods for ordinary and 
partial differential equations that writing yet another one requires a different view 
on the topic. The present book is not so concerned with the traditional academic 
presentation of the topic, but is focused on teaching the practitioner how to obtain
reliable computations involving finite difference methods. This focus is based on a 
set of learning outcomes: 


1. understanding of the ideas behind finite difference methods,
2. understanding how to transform an algorithm to a well-designed computer code,
3. understanding how to test (verify) the code,
4. understanding potential artifacts in simulation results.


Compared to other textbooks, the present one has a particularly strong emphasis 
on computer implementation and verification. It also has a strong emphasis on an 
intuitive understanding of constructing finite difference methods. To learn about 
the potential non-physical artifacts of various methods, we study exact solutions 
of finite difference schemes as these give deeper insight into the physical behavior 
of the numerical methods than the traditional (and more general) asymptotic error 
analysis. However, asymptotic results regarding convergence rates, typically trun- 
cation errors, are crucial for testing implementations, so an extensive appendix is 
devoted to the computation of truncation errors. 


Why finite differences? One may ask why we do finite differences when finite 
element and finite volume methods have been developed to greater generality and 
sophistication than finite differences and can cover more problems. The finite ele- 
ment and finite volume methods are also the industry standard nowadays. Why not 
just those methods? The reason for finite differences is the method’s simplicity, both 
from a mathematical and coding perspective. Especially in academia, where simple 
model problems are used a lot for teaching and in research (e.g., for verification of 
advanced implementations), there is a constant need to solve the model problems 
from scratch with easy-to-verify computer codes. Here, finite differences are ideal. 
A simple 1D heat equation can of course be solved by a finite element package, but 
a 20-line code with a difference scheme is just right to the point and provides an 
understanding of all details involved in the model and the solution method.
Everybody nowadays has a laptop, and the natural way to attack a 1D heat equation is
a simple Python or Matlab program with a difference scheme. The same conclusion holds
for other fundamental PDEs like the wave equation and Poisson equation as long
as the geometry of the domain is a hypercube. The present book contains all the 
practical information needed to use the finite difference tool in a safe way. 

Various pedagogical elements are utilized to reach the learning outcomes, and 
these are commented upon next. 


Simplify, understand, generalize The book’s overall pedagogical philosophy is 
the three-step process of first simplifying the problem to something we can under- 
stand in detail, and when that understanding is in place, we can generalize and 
hopefully address real-world applications with a sound scientific problem-solving 
approach. For example, in the chapter on a particular family of equations we first 
simplify the problem in question to a 1D, constant-coefficient equation with simple 
boundary conditions. We learn how to construct a finite difference method, how to 
implement it, and how to understand the behavior of the numerical solution. Then 
we can generalize to higher dimensions, variable coefficients, a source term, and 
more complicated boundary conditions. The solution of a compound problem is in 
this way an assembly of elements that are well understood in simpler settings. 


Constructive mathematics This text favors a constructive approach to mathematics.
Instead of presenting a set of definitions followed by a ready-made method, we emphasize
how to think about the construction of a method. The aim is to obtain a good intu- 
itive understanding of the mathematical methods. 

The text is written in an easy-to-read style much inspired by the following quote. 


Some people think that stiff challenges are the best device to induce learning, but I am not 
one of them. The natural way to learn something is by spending vast amounts of easy, 
enjoyable time at it. This goes whether you want to speak German, sight-read at the piano, 
type, or do mathematics. Give me the German storybook for fifth graders that I feel like 
reading in bed, not Goethe and a dictionary. The latter will bring rapid progress at first, 
then exhaustion and failure to resolve. 

The main thing to be said for stiff challenges is that inevitably we will encounter them, 
so we had better learn to face them boldly. Putting them in the curriculum can help teach 
us to do so. But for teaching the skill or subject matter itself, they are overrated. [18, p. 86] 
Lloyd N. Trefethen, Applied Mathematician, 1955-. 


This book assumes some basic knowledge of finite difference approximations, 
differential equations, and scientific Python or MATLAB programming, as often 
met in an introductory numerical methods course. Readers without this background 
may start with the light companion book “Finite Difference Computing with Expo- 
nential Decay Models” [9]. That book will in particular be a useful resource for the 
programming parts of the present book. Since the present book deals with partial 
differential equations, the reader is assumed to master multi-variable calculus and 
linear algebra. 

Fundamental ideas and their associated scientific details are first introduced in 
the simplest possible differential equation setting, often an ordinary differential 
equation, but in a way that easily allows reuse in more complex settings with par- 
tial differential equations. With this approach, new concepts are introduced with a 
minimum of mathematical details. The text should therefore have a potential for
use early in undergraduate student programs. 


All nuts and bolts Many have experienced that “vast amounts of easy, enjoyable 
time”, as stated in the quote above, arises when mathematics is implemented on 
a computer. The implementation process triggers understanding, creativity, and 
curiosity, but many students find the transition from a mathematical algorithm to a 
working code difficult and spend a lot of time on “programming issues”. 

Most books on numerical methods concentrate on the mathematics of the subject 
while details on going from the mathematics to a computer implementation are 
less in focus. A major purpose of this text is therefore to help the practitioner by 
providing all nuts and bolts necessary for safely going from the mathematics to a 
well-designed and well-tested computer code. A significant portion of the text is 
consequently devoted to programming details. 


Python as programming language While MATLAB enjoys widespread popular- 
ity in books on numerical methods, we have chosen to use the Python programming 
language. Python is very similar to MATLAB, but contains a lot of modern soft- 
ware engineering tools that have become standard in the software industry and that 
should be adopted also for numerical computing projects. Python is at present also 
experiencing an exponential growth in popularity within the scientific computing 
community. One of the book’s goals is to present an up-to-date Python ecosystem
for implementing finite difference methods. 


Program verification Program testing, called verification, is a key topic of the 
book. Good verification techniques are indispensable when debugging computer 
code, but also fundamental for achieving reliable simulations. Two verification 
techniques saturate the book: exact solution of discrete equations (where the ap- 
proximation error vanishes) and empirical estimation of convergence rates in prob- 
lems with exact (analytical or manufactured) solutions of the differential equa- 
tion(s). 


Vectorized code Finite difference methods lead to code with loops over large ar- 
rays. Such code in plain Python is known to run slowly. We demonstrate, especially 
in Appendix C, how to port loops to fast, compiled code in C or Fortran. However, 
an alternative is to vectorize the code to get rid of explicit Python loops, and this 
technique is met throughout the book. Vectorization becomes closely connected to 
the underlying array library, here numpy, and is often thought of as a difficult sub- 
ject by students. Through numerous examples in different contexts, we hope that 
the present book provides a substantial contribution to explaining how algorithms 
can be vectorized. Not only will this speed up serial code, but with a library that can 
produce parallel code from numpy commands (such as Numba, http://numba.pydata.org), vectorized code
can be automatically turned into parallel code and utilize multi-core processors and
GPUs. Also when creating tailored parallel code for today’s supercomputers, vectorization
is useful as it emphasizes splitting up an algorithm into plain and simple
array operations, where each operation is trivial to parallelize efficiently, rather than
trying to develop a “smart” overall parallelization strategy.


Analysis via exact solutions of discrete equations Traditional asymptotic analy- 
sis of errors is important for verification of code using convergence rates, but gives a 
limited understanding of how and why a correctly implemented numerical method 
may give non-physical results. By developing exact solutions, usually based on 
Fourier methods, of the discrete equations, one can obtain a physical understanding 
of the behavior of a numerical method. This approach is favored for analysis of 
methods in this book. 


Code-inspired mathematical notation Our primary aim is to have a clean and 
easy-to-read computer code, and we want a close one-to-one relationship between 
the computer code and mathematical description of the algorithm. This principle 
calls for a mathematical notation that is governed by the natural notation in the 
computer code. The unknown is mostly called u, but the meaning of the symbol u 
in the mathematical description changes as we go from the exact solution fulfilling 
the differential equation to the symbol u that is naturally used for the associated 
data structure in the code. 


Limited scope The aim of this book is not to give an overview of a lot of methods 
for a wide range of mathematical models. Such information can be found in numer- 
ous existing, more advanced books. The aim is rather to introduce basic concepts 
and a thorough understanding of how to think about computing with finite differ- 
ence methods. We therefore go in depth with only the most fundamental methods 
and equations. However, we have a multi-disciplinary scope and address the inter- 
play of mathematics, numerics, computer science, and physics. 


Focus on wave phenomena Most books on finite difference methods, or books 
on theory with computer examples, have their emphasis on diffusion phenomena. 
Half of this book (Chap. 1, 2, and Appendix C) is devoted to wave phenomena. 
Extended material on this topic is not so easy to find in the literature, so the book
should be a valuable contribution in this respect. Wave phenomena are also a good
topic in general for choosing the finite difference method over other discretization 
methods since one quickly needs fine resolution over the entire mesh and uniform 
meshes are most natural. 

Instead of introducing the finite difference method for diffusion problems, where 
one soon ends up with matrix systems, we do the introduction in a wave phenomena 
setting where explicit schemes are most relevant. This flattens the learning
curve since we can introduce a lot of theory for differences and for software aspects 
in a context with simple, explicit stencils for updating the solution. 


Independent chapters Most book authors are careful to avoid repetition of
material. The chapters in this book, however, contain some overlap, because we 
want the chapters to appear meaningful on their own. Modern publishing technol- 
ogy makes it easy to take selected chapters from different books to make a new book 
tailored to a specific course. The more a chapter builds on details in other chapters, 
the more difficult it is to reuse chapters in new contexts. Also, most readers find it 
convenient that important information is explicitly stated, even if it was already met
in another chapter. 


Supplementary materials All program and data files referred to in this book are 
available from the book’s primary web site: URL: http://github.com/hplgit/fdm-book/.
Vibration ODEs


Vibration problems lead to differential equations with solutions that oscillate in 
time, typically in a damped or undamped sinusoidal fashion. Such solutions put 
certain demands on the numerical methods compared to other phenomena whose 
solutions are monotone or very smooth. Both the frequency and amplitude of the 
oscillations need to be accurately handled by the numerical schemes. The forthcom- 
ing text presents a range of different methods, from classical ones (Runge-Kutta and 
midpoint/Crank-Nicolson methods), to more modern and popular symplectic (geometric)
integration schemes (Leapfrog, Euler-Cromer, and Störmer-Verlet methods),
but with a clear emphasis on the latter. Vibration problems occur throughout
mechanics and physics, but the methods discussed in this text are also fundamen- 
tal for constructing successful algorithms for partial differential equations of wave 
nature in multiple spatial dimensions. 


1.1 Finite Difference Discretization 


Many of the numerical challenges faced when computing oscillatory solutions to 
ODEs and PDEs can be captured by the very simple ODE $u'' + u = 0$. This ODE
is thus chosen as our starting point for method development, implementation, and 
analysis. 


1.1.1 A Basic Model for Vibrations 

The simplest model of a vibrating mechanical system has the following form:

\[ u'' + \omega^2 u = 0,\quad u(0) = I,\quad u'(0) = 0,\quad t \in (0, T]. \tag{1.1} \]

Here, $\omega$ and $I$ are given constants. Section 1.12.1 derives (1.1) from physical principles
and explains what the constants mean.


The exact solution of (1.1) is 


\[ u(t) = I\cos(\omega t). \tag{1.2} \]


That is, $u$ oscillates with constant amplitude $I$ and angular frequency $\omega$. The corresponding
period of oscillations (i.e., the time between two neighboring peaks in the
cosine function) is $P = 2\pi/\omega$. The number of periods per second is $f = \omega/(2\pi)$
and measured in the unit Hz. Both $f$ and $\omega$ are referred to as frequency, but $\omega$ is
more precisely named angular frequency, measured in rad/s.
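
The relations between angular frequency, period, and frequency can be illustrated with a few lines of Python. This is only a small sketch; the value chosen for the angular frequency and the sample time 0.3 are arbitrary assumptions made for the illustration.

```python
import math

w = 2*math.pi          # angular frequency in rad/s (arbitrary example value)
P = 2*math.pi/w        # period of the oscillations
f = w/(2*math.pi)      # frequency in Hz (periods per second)

def u_exact(t, I=1.0):
    """Exact solution u(t) = I*cos(w*t) of (1.1)."""
    return I*math.cos(w*t)

# The solution repeats itself after one period P
print(P, f, u_exact(0.3), u_exact(0.3 + P))
```

The last two printed numbers should agree up to rounding errors, since shifting the time by one period leaves the cosine unchanged.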

In vibrating mechanical systems modeled by (1.1), $u(t)$ very often represents
a position or a displacement of a particular point in the system. The derivative
$u'(t)$ then has the interpretation of velocity, and $u''(t)$ is the associated acceleration.
The model (1.1) is not only applicable to vibrating mechanical systems, but also to 
oscillations in electrical circuits. 


1.1.2 A Centered Finite Difference Scheme 


To formulate a finite difference method for the model problem (1.1), we follow the 
four steps explained in Section 1.1.2 in [9]. 


Step 1: Discretizing the domain The domain is discretized by introducing a
uniformly partitioned time mesh. The points in the mesh are $t_n = n\Delta t$, $n =
0, 1, \ldots, N_t$, where $\Delta t = T/N_t$ is the constant length of the time steps. We
introduce a mesh function $u^n$ for $n = 0, 1, \ldots, N_t$, which approximates the exact
solution at the mesh points. (Note that $n = 0$ is the known initial condition, so
$u^n$ is identical to the mathematical $u$ at this point.) The mesh function $u^n$ will be
computed from algebraic equations derived from the differential equation problem.


Step 2: Fulfilling the equation at discrete time points The ODE is to be satisfied
at each mesh point where the solution must be found:

\[ u''(t_n) + \omega^2 u(t_n) = 0,\quad n = 1, \ldots, N_t. \tag{1.3} \]
Step 3: Replacing derivatives by finite differences The derivative $u''(t_n)$ is to be
replaced by a finite difference approximation. A common second-order accurate
approximation to the second-order derivative is

\[ u''(t_n) \approx \frac{u^{n+1} - 2u^n + u^{n-1}}{\Delta t^2}. \tag{1.4} \]
Inserting (1.4) in (1.3) yields

\[ \frac{u^{n+1} - 2u^n + u^{n-1}}{\Delta t^2} = -\omega^2 u^n. \tag{1.5} \]


We also need to replace the derivative in the initial condition by a finite difference.
Here we choose a centered difference, whose accuracy is similar to the
centered difference we used for $u''$:

\[ \frac{u^1 - u^{-1}}{2\Delta t} = 0. \tag{1.6} \]
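
The second-order accuracy of the centered difference for the second derivative can be checked numerically. The sketch below applies the difference to $u = \cos(\omega t)$ at one point and halves the step repeatedly; the function name second_derivative_error, the evaluation point, and the step sizes are assumptions made for this illustration.

```python
import numpy as np

def second_derivative_error(dt, w=1.0, t=0.7):
    """Error of the centered difference (1.4) applied to u = cos(w*t)."""
    u = lambda s: np.cos(w*s)
    approx = (u(t + dt) - 2*u(t) + u(t - dt))/dt**2
    exact = -w**2*u(t)
    return abs(approx - exact)

dts = [0.1/2**k for k in range(4)]
errors = [second_derivative_error(dt) for dt in dts]
# For a second-order approximation, halving dt reduces the error
# by a factor of about 4
ratios = [errors[k]/errors[k+1] for k in range(len(errors)-1)]
print(ratios)
```

Error ratios close to 4 are what one expects when the error is proportional to the square of the step.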
Step 4: Formulating a recursive algorithm To formulate the computational algorithm,
we assume that we have already computed $u^{n-1}$ and $u^n$, such that $u^{n+1}$ is
the unknown value to be solved for:

\[ u^{n+1} = 2u^n - u^{n-1} - \Delta t^2\omega^2 u^n. \tag{1.7} \]


The computational algorithm is simply to apply (1.7) successively for $n =
1, 2, \ldots, N_t - 1$. This numerical scheme sometimes goes under the name Störmer’s
method, Verlet integration¹, or the Leapfrog method (one should note that Leapfrog
is used for many quite different methods for quite different differential equations!).


Computing the first step We observe that (1.7) cannot be used for $n = 0$ since
the computation of $u^1$ then involves the undefined value $u^{-1}$ at $t = -\Delta t$. The
discretization of the initial condition then comes to our rescue: (1.6) implies $u^{-1} =
u^1$ and this relation can be combined with (1.7) for $n = 0$ to yield a value for $u^1$:

\[ u^1 = 2u^0 - u^1 - \Delta t^2\omega^2 u^0, \]

which reduces to

\[ u^1 = u^0 - \frac{1}{2}\Delta t^2\omega^2 u^0. \tag{1.8} \]


Exercise 1.5 asks you to perform an alternative derivation and also to generalize the
initial condition to $u'(0) = V \neq 0$.


The computational algorithm The steps for solving (1.1) become

1. $u^0 = I$
2. compute $u^1$ from (1.8)
3. for $n = 1, 2, \ldots, N_t - 1$: compute $u^{n+1}$ from (1.7)


The algorithm is more precisely expressed directly in Python: 


from numpy import linspace, zeros

t = linspace(0, T, Nt+1)   # mesh points in time
dt = t[1] - t[0]           # constant time step
u = zeros(Nt+1)            # solution

u[0] = I
u[1] = u[0] - 0.5*dt**2*w**2*u[0]
for n in range(1, Nt):
    u[n+1] = 2*u[n] - u[n-1] - dt**2*w**2*u[n]


Remark on using w for $\omega$ in computer code

In the code, we use w as the symbol for $\omega$. The reason is that the authors prefer w
for readability and comparison with the mathematical $\omega$ instead of the full word
omega as variable name.


¹ http://en.wikipedia.org/wiki/Verlet_integration
Operator notation We may write the scheme using a compact difference notation
listed in Appendix A.1 (see also Section 1.1.8 in [9]). The difference (1.4) has the
operator notation $[D_t D_t u]^n$ such that we can write:

\[ [D_t D_t u + \omega^2 u = 0]^n. \tag{1.9} \]

Note that $[D_t D_t u]^n$ means applying a central difference with step $\Delta t/2$ twice:

\[ [D_t(D_t u)]^n = \frac{[D_t u]^{n+\frac{1}{2}} - [D_t u]^{n-\frac{1}{2}}}{\Delta t}, \]

which is written out as

\[ \frac{1}{\Delta t}\left(\frac{u^{n+1} - u^n}{\Delta t} - \frac{u^n - u^{n-1}}{\Delta t}\right) = \frac{u^{n+1} - 2u^n + u^{n-1}}{\Delta t^2}. \]

The discretization of initial conditions can in the operator notation be expressed
as

\[ [u = I]^0,\quad [D_{2t} u = 0]^0, \tag{1.10} \]

where the operator $[D_{2t} u]^n$ is defined as

\[ [D_{2t} u]^n = \frac{u^{n+1} - u^{n-1}}{2\Delta t}. \tag{1.11} \]
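
The two difference operators can be mirrored by small array functions. The sketch below is only an illustration: the function names DtDt and D2t are hypothetical helpers, not part of the book's software, and the test function $u = t^2$ is chosen because both differences are then exact.

```python
import numpy as np

def DtDt(u, dt):
    """[D_t D_t u]^n at the inner mesh points n = 1, ..., Nt-1."""
    return (u[2:] - 2*u[1:-1] + u[:-2])/dt**2

def D2t(u, dt):
    """[D_2t u]^n at the inner mesh points n = 1, ..., Nt-1."""
    return (u[2:] - u[:-2])/(2*dt)

dt = 0.1
t = np.linspace(0, 1, 11)   # uniform mesh with spacing dt
u = t**2                    # mesh function u^n = t_n**2
print(DtDt(u, dt))          # approximates u'' = 2
print(D2t(u, dt))           # approximates u' = 2*t at the inner points
```

Each function returns an array that is two entries shorter than u, because the centered stencils are undefined at the first and last mesh point.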


1.2 Implementation 
1.2.1 Making a Solver Function 


The algorithm from the previous section is readily translated to a complete Python
function for computing and returning $u^0, u^1, \ldots, u^{N_t}$ and $t_0, t_1, \ldots, t_{N_t}$, given the
input $I$, $\omega$, $\Delta t$, and $T$:


import numpy as np
import matplotlib.pyplot as plt

def solver(I, w, dt, T):
    """
    Solve u'' + w**2*u = 0 for t in (0,T], u(0)=I and u'(0)=0,
    by a central finite difference method with time step dt.
    """
    dt = float(dt)
    Nt = int(round(T/dt))
    u = np.zeros(Nt+1)
    t = np.linspace(0, Nt*dt, Nt+1)

    u[0] = I
    u[1] = u[0] - 0.5*dt**2*w**2*u[0]
    for n in range(1, Nt):
        u[n+1] = 2*u[n] - u[n-1] - dt**2*w**2*u[n]
    return u, t
We have imported numpy and matplotlib under the names np and plt, respectively,
as this is very common in the Python scientific computing community and
a good programming habit (since we explicitly see where the different functions 
come from). An alternative is to do from numpy import * and a similar “import 
all” for Matplotlib to avoid the np and plt prefixes and make the code as close as 
possible to MATLAB. (See Section 5.1.4 in [9] for a discussion of the two types of 
import in Python.) 

A function for plotting the numerical and the exact solution is also convenient to
have:

def u_exact(t, I, w):
    return I*np.cos(w*t)

def visualize(u, t, I, w):
    plt.plot(t, u, 'r--o')
    t_fine = np.linspace(0, t[-1], 1001)  # very fine mesh for u_e
    u_e = u_exact(t_fine, I, w)
    # Successive plot calls draw in the same axes in current Matplotlib,
    # so the old plt.hold('on') call (removed in Matplotlib 3.0) is omitted
    plt.plot(t_fine, u_e, 'b-')
    plt.legend(['numerical', 'exact'], loc='upper left')
    plt.xlabel('t')
    plt.ylabel('u')
    dt = t[1] - t[0]
    plt.title('dt=%g' % dt)
    umin = 1.2*u.min();  umax = -umin
    plt.axis([t[0], t[-1], umin, umax])
    plt.savefig('tmp1.png');  plt.savefig('tmp1.pdf')


A corresponding main program calling these functions to simulate a given number
of periods (num_periods) may take the form

from math import pi

I = 1
w = 2*pi
dt = 0.05
num_periods = 5
P = 2*pi/w    # one period
T = P*num_periods
u, t = solver(I, w, dt, T)
visualize(u, t, I, w)


Adjusting some of the input parameters via the command line can be handy. 
Here is a code segment using the ArgumentParser tool in the argparse module 
to define option value (--option value) pairs on the command line:


import argparse
from math import pi

parser = argparse.ArgumentParser()
parser.add_argument('--I', type=float, default=1.0)
parser.add_argument('--w', type=float, default=2*pi)
parser.add_argument('--dt', type=float, default=0.05)
parser.add_argument('--num_periods', type=int, default=5)
a = parser.parse_args()
I, w, dt, num_periods = a.I, a.w, a.dt, a.num_periods
Such parsing of the command line is explained in more detail in Section 5.2.3 in
[9].
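
When experimenting, it can be handy to try the option parsing without leaving the Python session. The sketch below wraps the parser in a function that takes the option strings as a list; the function name parse_options is a hypothetical helper introduced here, while passing a list to parse_args is standard argparse behavior.

```python
import argparse
from math import pi

def parse_options(argv):
    """Parse vibration parameters from a list of command-line strings."""
    parser = argparse.ArgumentParser()
    parser.add_argument('--I', type=float, default=1.0)
    parser.add_argument('--w', type=float, default=2*pi)
    parser.add_argument('--dt', type=float, default=0.05)
    parser.add_argument('--num_periods', type=int, default=5)
    return parser.parse_args(argv)

# Same options as one would give on the command line
a = parse_options(['--num_periods', '20', '--dt', '0.1'])
print(a.I, a.w, a.dt, a.num_periods)
```

Options that are not given keep their default values, and the type argument makes sure the strings are converted to numbers.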


A typical execution goes like 


Terminal 


Terminal> python vib_undamped.py --num_periods 20 --dt 0.1 


Computing $u'$ In mechanical vibration applications one is often interested in computing
the velocity $v(t) = u'(t)$ after $u(t)$ has been computed. This can be done by
a central difference,

\[ v(t_n) = u'(t_n) \approx v^n = \frac{u^{n+1} - u^{n-1}}{2\Delta t} = [D_{2t} u]^n. \tag{1.12} \]

This formula applies for all inner mesh points, $n = 1, \ldots, N_t - 1$. For $n = 0$, $v(0)$
is given by the initial condition on $u'(0)$, and for $n = N_t$ we can use a one-sided,
backward difference:

\[ v^n = [D_t^- u]^n = \frac{u^n - u^{n-1}}{\Delta t}. \]
Typical (scalar) code is 


v = np.zeros_like(u)  # or v = np.zeros(len(u))
# Use central difference for internal points
for i in range(1, len(u)-1):
    v[i] = (u[i+1] - u[i-1])/(2*dt)
# Use initial condition for u'(0) when i=0
v[0] = 0
# Use backward difference at the final mesh point
v[-1] = (u[-1] - u[-2])/dt


Since the loop is slow for large $N_t$, we can get rid of the loop by vectorizing the
central difference. The above code segment goes as follows in its vectorized version
(see Problem 1.2 in [9] for explanation of details):

v = np.zeros_like(u)
v[1:-1] = (u[2:] - u[:-2])/(2*dt)  # central difference
v[0] = 0                           # boundary condition u'(0)
v[-1] = (u[-1] - u[-2])/dt         # backward difference
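
A simple way to gain confidence in a vectorized expression is to compare it with the loop version on some test data. The following sketch does that; the function names velocity_scalar and velocity_vectorized, and the cosine test data, are assumptions made for the illustration.

```python
import numpy as np

def velocity_scalar(u, dt):
    """Loop version of the velocity computation."""
    v = np.zeros_like(u)
    for i in range(1, len(u)-1):
        v[i] = (u[i+1] - u[i-1])/(2*dt)
    v[0] = 0
    v[-1] = (u[-1] - u[-2])/dt
    return v

def velocity_vectorized(u, dt):
    """Vectorized version: slices replace the explicit loop."""
    v = np.zeros_like(u)
    v[1:-1] = (u[2:] - u[:-2])/(2*dt)
    v[0] = 0
    v[-1] = (u[-1] - u[-2])/dt
    return v

dt = 0.05
t = np.linspace(0, 1, 21)
u = np.cos(2*np.pi*t)       # test data with u'(0) = 0
diff = np.abs(velocity_scalar(u, dt) - velocity_vectorized(u, dt)).max()
print(diff)
```

The slice u[2:] plays the role of u[i+1] and u[:-2] the role of u[i-1] for all inner indices at once, so the two versions perform the same arithmetic.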


1.2.2 Verification 


Manual calculation The simplest type of verification, which is also instructive
for understanding the algorithm, is to compute $u^1$, $u^2$, and $u^3$ with the aid of a
calculator and make a function for comparing these results with those from the
solver function. The test_three_steps function in the file vib_undamped.py
shows the details of how we use the hand calculations to test the code:
def test_three_steps():
    from math import pi
    I = 1;  w = 2*pi;  dt = 0.1;  T = 1
    u_by_hand = np.array([1.000000000000000,
                          0.802607911978213,
                          0.288358920740053])
    u, t = solver(I, w, dt, T)
    diff = np.abs(u_by_hand - u[:3]).max()
    tol = 1E-14
    assert diff < tol


This function is a proper test function, compliant with the pytest and nose testing
frameworks for Python code, because

• the function name begins with test_
• the function takes no arguments
• the test is formulated as a boolean condition and executed by assert


We shall in this book implement all software verification via such proper test func- 
tions, also known as unit testing. See Section 5.3.2 in [9] for more details on how to 
construct test functions and utilize nose or pytest for automatic execution of tests. 
Our recommendation is to use pytest. With this choice, you can run all test functions 
in vib_undamped.py by


Terminal 


Terminal> py.test -s -v vib_undamped.py 
=========================== test session starts ======...
platform linux2 -- Python 2.7.9 -- ... 

collected 2 items 


vib_undamped.py::test_three_steps PASSED 
vib_undamped.py::test_convergence_rates PASSED 


=========================== 2 passed in 0.19 seconds ===... 


Testing very simple polynomial solutions Constructing test problems where the 
exact solution is constant or linear helps initial debugging and verification as one 
expects any reasonable numerical method to reproduce such solutions to machine 
precision. Second-order accurate methods will often also reproduce a quadratic 
solution. Here $[D_t D_t t^2]^n = 2$, which is the exact result. A solution $u = t^2$
leads to $u'' + \omega^2 u = 2 + (\omega t)^2 \neq 0$. We must therefore add a source in the
equation: $u'' + \omega^2 u = f$ to allow a solution $u = t^2$ for $f = 2 + (\omega t)^2$. By
simple insertion we can show that the mesh function $u^n = t_n^2$ is also a solution of
the discrete equations. Problem 1.1 asks you to carry out all details to show that 
linear and quadratic solutions are solutions of the discrete equations. Such results 
are very useful for debugging and verification. You are strongly encouraged to do 
this problem now! 
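
The claim about the quadratic solution can also be checked numerically. The sketch below extends the scheme with a source term; the function solver_source is a hypothetical variant written for this illustration (it is not the book's solver), with the first step obtained by combining the scheme including the source with $u^{-1} = u^1$.

```python
import numpy as np

def solver_source(w, dt, T, f):
    """Solve u'' + w**2*u = f(t), u(0)=0, u'(0)=0 by the centered scheme."""
    Nt = int(round(T/dt))
    t = np.linspace(0, Nt*dt, Nt+1)
    u = np.zeros(Nt+1)
    u[0] = 0
    # First step: scheme with source term combined with u^{-1} = u^1
    u[1] = u[0] + 0.5*dt**2*(f(t[0]) - w**2*u[0])
    for n in range(1, Nt):
        u[n+1] = 2*u[n] - u[n-1] + dt**2*(f(t[n]) - w**2*u[n])
    return u, t

w = 1.5                                   # arbitrary example value
u, t = solver_source(w, dt=0.1, T=2, f=lambda t: 2 + (w*t)**2)
error = np.abs(u - t**2).max()            # should be at rounding-error level
print(error)
```

An error at the level of machine precision, independent of the time step, indicates that the quadratic is reproduced exactly by the discrete equations. This does not replace the analytical verification asked for in Problem 1.1.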


Checking convergence rates Empirical computation of convergence rates yields 
a good method for verification. The method and its computational details are
explained in detail in Section 3.1.6 in [9]. Readers not familiar with the concept should
look up this reference before proceeding.
In the present problem, computing convergence rates means that we must 


• perform $m$ simulations, halving the time steps as: $\Delta t_i = 2^{-i}\Delta t_0$, $i = 1, \ldots,
m - 1$, and $\Delta t_i$ is the time step used in simulation $i$;
• compute the $L^2$ norm of the error, $E_i = \sqrt{\Delta t_i \sum_{n=0}^{N_t-1} (u^n - u_e(t_n))^2}$, in each
case;
• estimate the convergence rates $r_i$ based on two consecutive experiments
$(\Delta t_{i-1}, E_{i-1})$ and $(\Delta t_i, E_i)$, assuming $E_i = C(\Delta t_i)^r$ and $E_{i-1} = C(\Delta t_{i-1})^r$,
where $C$ is a constant. From these equations it follows that $r = \ln(E_{i-1}/E_i)/
\ln(\Delta t_{i-1}/\Delta t_i)$. Since this $r$ will vary with $i$, we equip it with an index and call
it $r_{i-1}$, where $i$ runs from 1 to $m - 1$.


The computed rates $r_0, r_1, \ldots, r_{m-2}$ hopefully converge to the number 2 in the
present problem, because theory (from Sect. 1.4) shows that the error of the
numerical method we use behaves like $\Delta t^2$. The convergence of the sequence
$r_0, r_1, \ldots, r_{m-2}$ demands that the time steps $\Delta t_i$ are sufficiently small for the error
model $E_i = C(\Delta t_i)^r$ to be valid.
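
To see the rate formula at work in isolation, one can feed it synthetic errors that follow the error model exactly. The sketch below does this with made-up data; the function name rate and the constant C = 3 are assumptions for the illustration, and no differential equation is solved.

```python
import math

def rate(dt_prev, E_prev, dt, E):
    """Estimate r from two experiments, assuming E = C*dt**r."""
    return math.log(E_prev/E)/math.log(dt_prev/dt)

# Synthetic errors obeying E = C*dt**2 exactly (illustrative values)
C = 3.0
dt_values = [0.1/2**i for i in range(4)]
E_values = [C*dt**2 for dt in dt_values]

r = [rate(dt_values[i-1], E_values[i-1], dt_values[i], E_values[i])
     for i in range(1, len(dt_values))]
print(r)
```

Since the synthetic data fit the model perfectly, all estimated rates should equal 2 up to rounding errors. With errors from a real simulation, the estimates approach the theoretical rate only as the time step becomes small.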

All the implementational details of computing the sequence $r_0, r_1, \ldots, r_{m-2}$ appear
below.


def convergence_rates(m, solver_function, num_periods=8):
    """
    Return m-1 empirical estimates of the convergence rate
    based on m simulations, where the time step is halved
    for each simulation.
    solver_function(I, w, dt, T) solves each problem, where T
    is based on simulation for num_periods periods.
    """
    from math import pi
    w = 0.35;  I = 0.3      # just chosen values
    P = 2*pi/w              # period
    dt = P/30               # 30 time steps per period 2*pi/w
    T = P*num_periods

    dt_values = []
    E_values = []
    for i in range(m):
        u, t = solver_function(I, w, dt, T)
        u_e = u_exact(t, I, w)
        E = np.sqrt(dt*np.sum((u_e-u)**2))
        dt_values.append(dt)
        E_values.append(E)
        dt = dt/2

    r = [np.log(E_values[i-1]/E_values[i])/
         np.log(dt_values[i-1]/dt_values[i])
         for i in range(1, m, 1)]
    return r, E_values, dt_values


The error analysis in Sect. 1.4 is quite detailed and suggests that $r = 2$. It is
also an intuitively reasonable result, since we used a second-order accurate finite
difference approximation $[D_t D_t u]^n$ to the ODE and a second-order accurate finite
difference formula for the initial condition for $u'$.

In the present problem, when $\Delta t$ corresponds to 30 time steps per period, the
returned r list has all its values equal to 2.00 (if rounded to two decimals). This
amazingly accurate result means that all $\Delta t_i$ values are well into the asymptotic
regime where the error model $E_i = C(\Delta t_i)^r$ is valid.

We can now construct a proper test function that computes convergence rates 
and checks that the final (and usually the best) estimate is sufficiently close to 2. 
Here, a rough tolerance of 0.1 is enough. Later, we will argue for an improvement 
by adjusting omega and include also that case in our test function here. The unit 
test goes like 


def test_convergence_rates():
    r, E, dt = convergence_rates(
        m=5, solver_function=solver, num_periods=8)
    # Accept rate to 1 decimal place
    tol = 0.1
    assert abs(r[-1] - 2.0) < tol
    # Test that adjusted w obtains 4th order convergence
    r, E, dt = convergence_rates(
        m=5, solver_function=solver_adjust_w, num_periods=8)
    print('adjust w rates:', r)
    assert abs(r[-1] - 4.0) < tol


The complete code appears in the file vib_undamped.py.


Visualizing convergence rates with slope markers Tony S. Yu has written a
script plotslopes.py² that is very useful to indicate the slope of a graph, especially
a graph like $\ln E = r\ln\Delta t + \ln C$ arising from the model $E = C\Delta t^r$.
A copy of the script resides in the src/vib³ directory. Let us use it to compare
the original method for $u'' + \omega^2 u = 0$ with the same method applied to the equation
with a modified $\omega$. We make log-log plots of the error versus $\Delta t$. For each
curve we attach a slope marker using the slope_marker((x,y), r) function
from plotslopes.py, where (x,y) is the position of the marker and r is the
slope ((r, 1)), here (2,1) and (4,1).


def plot_convergence_rates():
    r2, E2, dt2 = convergence_rates(
        m=5, solver_function=solver, num_periods=8)
    plt.loglog(dt2, E2)
    r4, E4, dt4 = convergence_rates(
        m=5, solver_function=solver_adjust_w, num_periods=8)
    plt.loglog(dt4, E4)
    plt.legend(['original scheme', r'adjusted $\omega$'],
               loc='upper left')
    plt.title('Convergence of finite difference methods')
    from plotslopes import slope_marker
    slope_marker((dt2[1], E2[1]), (2,1))
    slope_marker((dt4[1], E4[1]), (4,1))


² http://goo.gl/A4Utm7
³ http://tinyurl.com/nu656p2/vib

Fig. 1.1 Empirical convergence rate curves with special slope marker


Figure 1.1 displays the two curves with the markers. The match of the curve 
slope and the marker slope is excellent. 


1.2.3 Scaled Model 


It is advantageous to use dimensionless variables in simulations, because fewer parameters
need to be set. The present problem is made dimensionless by introducing
dimensionless variables $\bar t = t/t_c$ and $\bar u = u/u_c$, where $t_c$ and $u_c$ are characteristic
scales for $t$ and $u$, respectively. We refer to Section 2.2.1 in [11] for all details about
this scaling.

The scaled ODE problem reads

\[ \frac{u_c}{t_c^2}\frac{d^2\bar u}{d\bar t^2} + \omega^2 u_c\bar u = 0,\quad u_c\bar u(0) = I,\quad \frac{u_c}{t_c}\frac{d\bar u}{d\bar t}(0) = 0. \]

A common choice is to take $t_c$ as one period of the oscillations, $t_c = 2\pi/\omega$, and
$u_c = I$. This gives the dimensionless model

\[ \frac{d^2\bar u}{d\bar t^2} + 4\pi^2\bar u = 0,\quad \bar u(0) = 1,\quad \bar u'(0) = 0. \tag{1.13} \]

Observe that there are no physical parameters in (1.13)! We can therefore perform
a single numerical simulation $\bar u(\bar t)$ and afterwards recover any $u(t; \omega, I)$ by

\[ u(t; \omega, I) = u_c\bar u(t/t_c) = I\bar u(\omega t/(2\pi)). \]

We can easily check this assertion: the solution of the scaled problem is $\bar u(\bar t) =
\cos(2\pi\bar t)$. The formula for $u$ in terms of $\bar u$ gives $u = I\cos(\omega t)$, which is nothing
but the solution of the original problem with dimensions.
The scaled model can be run by calling solver(I=1, w=2*pi, dt=dt, T=T). Each
period is now 1 and T simply counts the number of periods. Choosing dt as 1./M
gives M time steps per period.
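The recovery of a dimensional solution from one scaled simulation can be sketched as follows. The solver function here is a minimal stand-in written for this example (an assumption: it implements the centered scheme with the usual special formula for the first step, and is not the code from vib_undamped.py), and the values of I and w are arbitrary illustrations.

```python
import numpy as np

def solver(I, w, dt, T):
    """Centered scheme for u'' + w**2*u = 0, u(0)=I, u'(0)=0 (sketch)."""
    dt = float(dt)
    Nt = int(round(T/dt))
    u = np.zeros(Nt+1)
    t = np.linspace(0, Nt*dt, Nt+1)
    u[0] = I
    u[1] = u[0] - 0.5*dt**2*w**2*u[0]   # special formula for the first step
    for n in range(1, Nt):
        u[n+1] = 2*u[n] - u[n-1] - dt**2*w**2*u[n]
    return u, t

# One scaled simulation: 5 periods with M = 20 time steps per period
M = 20
u_bar, t_bar = solver(I=1, w=2*np.pi, dt=1.0/M, T=5)

# Recover a dimensional solution (I and w are hypothetical values)
I, w = 0.3, 4.0
t = t_bar*2*np.pi/w      # t = t_c*t_bar with t_c = 2*pi/w
u = I*u_bar              # u = u_c*u_bar with u_c = I

# Direct simulation with dimensions and the same resolution per period
P = 2*np.pi/w
u_direct, t_direct = solver(I=I, w=w, dt=P/M, T=5*P)
max_diff = np.abs(u - u_direct).max()
print('max difference between rescaled and direct solution: %g' % max_diff)
```

The two solutions should agree to rounding error, because the scheme only depends on the product of w and dt, which is the same in both runs.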


1.3 Visualization of Long Time Simulations 


Figure 1.2 shows a comparison of the exact and numerical solution for the scaled
model (1.13) with $\Delta t = 0.1, 0.05$. From the plot we make the following
observations:


• The numerical solution seems to have correct amplitude.
• There is an angular frequency error which is reduced by decreasing the time step.
• The total angular frequency error grows with time.


By angular frequency error we mean that the numerical angular frequency differs
from the exact $\omega$. This is evident by looking at the peaks of the numerical
solution: these have incorrect positions compared with the peaks of the exact
cosine solution. The effect can be mathematically expressed by writing the
numerical solution as $I\cos\tilde\omega t$, where $\tilde\omega$ is not exactly
equal to $\omega$. Later, we shall mathematically quantify this numerical angular
frequency $\tilde\omega$.


1.3.1 Using a Moving Plot Window 


In vibration problems it is often of interest to investigate the system's behavior
over long time intervals. Errors in the angular frequency accumulate and become
more visible as time grows. We can investigate long time series by introducing
a moving plot window that can move along with the p most recently computed
periods of the solution. The SciTools⁴ package contains a convenient tool for
this: MovingPlotWindow. Typing pydoc scitools.MovingPlotWindow shows
a demo and a description of its use. The function below utilizes the moving plot
window and is in fact called by the main function in the vib_undamped module if
the number of periods in the simulation exceeds 10.


Fig. 1.2 Effect of halving the time step (left: dt=0.1, right: dt=0.05; numerical and exact solutions)


4 https://github.com/hplgit/scitools


def visualize_front(u, t, I, w, savefig=False, skip_frames=1):
    """
    Visualize u and the exact solution vs t, using a
    moving plot window and continuous drawing of the
    curves as they evolve in time.
    Makes it easy to plot very long time series.
    Plots are saved to files if savefig is True.
    Only each skip_frames-th plot is saved (e.g., if
    skip_frames=10, only each 10th plot is saved to file;
    this is convenient if plot files corresponding to
    different time steps are to be compared).
    """
    import scitools.std as st
    from scitools.MovingPlotWindow import MovingPlotWindow
    from math import pi
    from numpy import cos  # vectorized cosine for the exact solution

    # Remove all old plot files tmp_*.png
    import glob, os
    for filename in glob.glob('tmp_*.png'):
        os.remove(filename)

    P = 2*pi/w  # one period
    umin = 1.2*u.min(); umax = -umin
    dt = t[1] - t[0]
    plot_manager = MovingPlotWindow(
        window_width=8*P,
        dt=dt,
        yaxis=[umin, umax],
        mode='continuous drawing')
    frame_counter = 0
    for n in range(1,len(u)):
        if plot_manager.plot(n):
            s = plot_manager.first_index_in_plot
            st.plot(t[s:n+1], u[s:n+1], 'r-1',
                    t[s:n+1], I*cos(w*t)[s:n+1], 'b-1',
                    title='t=%6.3f' % t[n],
                    axis=plot_manager.axis(),
                    show=not savefig) # drop window if savefig
            if savefig and n % skip_frames == 0:
                filename = 'tmp_%04d.png' % frame_counter
                st.savefig(filename)
                print 'making plot file', filename, 'at t=%g' % t[n]
                frame_counter += 1
        plot_manager.update(n)


We run the scaled problem (the default values for the command-line arguments 
-I and -w correspond to the scaled problem) for 40 periods with 20 time steps per 
period: 


Terminal 


Terminal> python vib_undamped.py --dt 0.05 --num_periods 40 




The moving plot window is invoked, and we can follow the numerical and exact
solutions as time progresses. From this demo we see that the angular frequency
error is small in the beginning, and that it becomes more prominent with time. A
new run with $\Delta t = 0.1$ (i.e., only 10 time steps per period) clearly shows
that the phase errors become significant even earlier in the time series,
deteriorating the solution further.


1.3.2 Making Animations 


Producing standard video formats The visualize_front function stores all 
the plots in files whose names are numbered: tmp_0000.png, tmp_0001.png, 
tmp_0002.png, and so on. From these files we may make a movie. The Flash 
format is popular, 


Terminal 


Terminal> ffmpeg -r 25 -i tmp_%04d.png -c:v flv movie.flv


The ffmpeg program can be replaced by the avconv program in the above command
if desired (but at the time of this writing there seems to be more momentum in
the ffmpeg project). The -r option should come first and describes the number
of frames per second in the movie (even if we would like to have slow movies,
keep this number as large as 25, otherwise files are skipped from the movie). The
-i option describes the name of the plot files. Other formats can be generated by
changing the video codec and equipping the video file with the right extension:


Format   Codec and filename
Flash    -c:v flv movie.flv
MP4      -c:v libx264 movie.mp4
WebM     -c:v libvpx movie.webm
Ogg      -c:v libtheora movie.ogg


The video file can be played by some video player like vlc, mplayer, gxine, or
totem, e.g.,


Terminal 


Terminal> vlc movie.webm 


A web page can also be used to play the movie. Today's standard is to use the
HTML5 video tag:


<video autoplay loop controls
       width='640' height='365' preload='none'>
<source src='movie.webm' type='video/webm; codecs="vp8, vorbis"'>
</video>
Modern browsers do not support all of the video formats. MP4 is needed to
successfully play the videos on Apple devices that use the Safari browser. WebM is
the preferred format for Chrome, Opera, Firefox, and Internet Explorer v9+. Flash
was a popular format, but older browsers that required Flash can play MP4. All
browsers that work with Ogg can also work with WebM. This means that to have
a video work in all browsers, the video should be available in the MP4 and WebM
formats. The proper HTML code reads


<video autoplay loop controls
       width='640' height='365' preload='none'>
<source src='movie.mp4' type='video/mp4;
 codecs="avc1.42E01E, mp4a.40.2"'>
<source src='movie.webm' type='video/webm;
 codecs="vp8, vorbis"'>
</video>


The MP4 format should appear first to ensure that Apple devices will load the video 
correctly. 


Caution: number the plot files correctly

To ensure that the individual plot frames are shown in correct order, it is
important to number the files with zero-padded numbers (0000, 0001, 0002, etc.).
The printf format %04d specifies an integer in a field of width 4, padded with
zeros from the left. A simple Unix wildcard file specification like tmp_*.png
will then list the frames in the right order. If the numbers in the filenames were
not zero-padded, the frame tmp_11.png would appear before tmp_2.png in the
movie.
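The effect of zero padding on the file order can be illustrated with a few lines of Python. This is only a sketch: alphabetic sorting of strings is used here to mimic the order in which a wildcard like tmp_*.png lists the files.

```python
# Frame numbers 0, 1, ..., 11 written without and with zero padding
plain = ['tmp_%d.png' % i for i in range(12)]
padded = ['tmp_%04d.png' % i for i in range(12)]

# Alphabetic sorting mimics the order produced by a wildcard listing
plain_sorted = sorted(plain)
padded_sorted = sorted(padded)

print(plain_sorted)    # tmp_10.png and tmp_11.png come before tmp_2.png
print(padded_sorted)   # frames appear in numerical order
```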


Playing PNG files in a web browser The scitools movie command can create
a movie player for a set of PNG files such that a web browser can be used to watch
the movie. This interface has the advantage that the speed of the movie can easily
be controlled, a feature that scientists often appreciate. The command for creating
an HTML file with a player for a set of PNG files tmp_*.png goes like


Terminal 


Terminal> scitools movie output_file=vib.html fps=4 tmp_*.png 


The fps argument controls the speed of the movie ("frames per second").
To watch the movie, load the video file vib.html into some browser, e.g.,


Terminal 


Terminal> google-chrome vib.html # invoke web page 




Click on Start movie to see the result. Moving this movie to some other place
requires moving vib.html and all the PNG files tmp_*.png:


Terminal 


Terminal> mkdir vib_dt0.1 
Terminal> mv tmp_*.png vib_dt0.1 
Terminal> mv vib.html vib_dt0.1/index.html 


Making animated GIF files The convert program from the ImageMagick soft- 
ware suite can be used to produce animated GIF files from a set of PNG files: 


Terminal 


Terminal> convert -delay 25 tmp_vib*.png tmp_vib.gif


The -delay option needs an argument of the delay between each frame, measured
in 1/100 s, so 4 frames/s here gives 25/100 s delay. Note, however, that in this
particular example with $\Delta t = 0.05$ and 40 periods, making an animated GIF
file out of the large number of PNG files is a very heavy process and not considered
feasible. Animated GIFs are best suited for animations with not so many frames
and where you want to see each frame and play them slowly.


1.3.3 Using Bokeh to Compare Graphs 


Instead of a moving plot frame, one can use tools that allow panning by the mouse.
For example, we can show four periods of several signals in several plots and then
scroll with the mouse through the rest of the simulation simultaneously in all the
plot windows. The Bokeh⁵ plotting library offers such tools, but the plots must be
displayed in a web browser. The documentation of Bokeh is excellent, so here we
just show how the library can be used to compare a set of u curves corresponding to
long time simulations. (By the way, the guidance to correct pronunciation of Bokeh
in the documentation⁶ and on Wikipedia⁷ is not directly compatible with a YouTube
video⁸ ...).

Imagine we have performed experiments for a set of $\Delta t$ values. We want each
curve, together with the exact solution, to appear in a plot, and then arrange all
plots in a grid-like fashion:


5 http://bokeh.pydata.org/en/latest
6 http://bokeh.pydata.org/en/0.10.0/docs/faq.html#how-do-you-pronounce-bokeh
7 https://en.wikipedia.org/wiki/Bokeh
8 https://www.youtube.com/watch?v=OR8HSHevQTM
(Figure: grid of Bokeh plots, one per time step, with titles of the form "# time steps per period: 20".)


Furthermore, we want the axes to couple such that if we move into the future in
one plot, all the other plots follow (note the displaced t axes!):


(Figure: the same plots after panning; every t axis now covers 150 to 170.)


A function for creating a Bokeh plot, given a list of u arrays and corresponding t 
arrays, is implemented below. The code combines data from different simulations, 
described compactly in a list of strings legends. 
def bokeh_plot(u, t, legends, I, w, t_range, filename):
    """
    Make plots for u vs t using the Bokeh library.
    u and t are lists (several experiments can be compared).
    legends contain legend strings for the various u,t pairs.
    """
    if not isinstance(u, (list,tuple)):
        u = [u]  # wrap in list
    if not isinstance(t, (list,tuple)):
        t = [t]  # wrap in list
    if not isinstance(legends, (list,tuple)):
        legends = [legends]  # wrap in list

    import bokeh.plotting as plt
    plt.output_file(filename, mode='cdn', title='Comparison')
    # Assume that all t arrays have the same range
    t_fine = np.linspace(0, t[0][-1], 1001)  # fine mesh for u_e
    tools = 'pan,wheel_zoom,box_zoom,reset,'\
            'save,box_select,lasso_select'
    u_range = [-1.2*I, 1.2*I]
    font_size = '8pt'
    p = []  # list of plot objects
    # Make the first figure
    p_ = plt.figure(
        width=300, plot_height=250, title=legends[0],
        x_axis_label='t', y_axis_label='u',
        x_range=t_range, y_range=u_range, tools=tools,
        title_text_font_size=font_size)
    p_.xaxis.axis_label_text_font_size=font_size
    p_.yaxis.axis_label_text_font_size=font_size
    p_.line(t[0], u[0], line_color='blue')
    # Add exact solution
    u_e = u_exact(t_fine, I, w)
    p_.line(t_fine, u_e, line_color='red', line_dash='4 4')
    p.append(p_)
    # Make the rest of the figures and attach their axes to
    # the first figure's axes
    for i in range(1, len(t)):
        p_ = plt.figure(
            width=300, plot_height=250, title=legends[i],
            x_axis_label='t', y_axis_label='u',
            x_range=p[0].x_range, y_range=p[0].y_range, tools=tools,
            title_text_font_size=font_size)
        p_.xaxis.axis_label_text_font_size = font_size
        p_.yaxis.axis_label_text_font_size = font_size
        p_.line(t[i], u[i], line_color='blue')
        p_.line(t_fine, u_e, line_color='red', line_dash='4 4')
        p.append(p_)

    # Arrange all plots in a grid with 3 plots per row
    grid = [[]]
    for i, p_ in enumerate(p):
        grid[-1].append(p_)
        if (i+1) % 3 == 0:
            # New row
            grid.append([])
    plot = plt.gridplot(grid, toolbar_location='left')
    plt.save(plot)
    plt.show(plot)
A particular example using the bokeh_plot function appears below.


def demo_bokeh():
    """Solve a scaled ODE u'' + u = 0."""
    from math import pi
    w = 1.0        # Scaled problem (frequency)
    P = 2*np.pi/w  # Period
    num_steps_per_period = [5, 10, 20, 40, 80]
    T = 40*P       # Simulation time: 40 periods
    u = []         # List of numerical solutions
    t = []         # List of corresponding meshes
    legends = []
    for n in num_steps_per_period:
        dt = P/n
        u_, t_ = solver(I=1, w=w, dt=dt, T=T)
        u.append(u_)
        t.append(t_)
        legends.append('# time steps per period: %d' % n)
    bokeh_plot(u, t, legends, I=1, w=w, t_range=[0, 4*P],
               filename='tmp.html')


1.3.4 Using a Line-by-Line Ascii Plotter 


Plotting functions vertically, line by line, in the terminal window using ascii char- 
acters only is a simple, fast, and convenient visualization technique for long time 
series. Note that the time axis then is positive downwards on the screen, so we can 
let the solution be visualized “forever”. The tool scitools.avplotter.Plotter 
makes it easy to create such plots: 


def visualize_front_ascii(u, t, I, w, fps=10):
    """
    Plot u and the exact solution vs t line by line in a
    terminal window (only using ascii characters).
    Makes it easy to plot very long time series.
    """
    from scitools.avplotter import Plotter
    import time
    from math import pi, cos
    P = 2*pi/w
    umin = 1.2*u.min(); umax = -umin

    p = Plotter(ymin=umin, ymax=umax, width=60, symbols='+o')
    for n in range(len(u)):
        print p.plot(t[n], u[n], I*cos(w*t[n])), \
              '%.1f' % (t[n]/P)
        time.sleep(1/float(fps))


The call p.plot returns a line of text, with the t axis marked and a symbol + for
the first function (u) and o for the second function (the exact solution). Here we
append to this text a time counter reflecting how many periods the current time
point corresponds to. A typical run uses $\omega = 2\pi$ and $\Delta t = 0.05$; the
output scrolls down the terminal with one text line per time level.
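The principle of line-by-line plotting can be sketched without SciTools. The function ascii_line below is a hypothetical helper written for this illustration (it is not the scitools.avplotter.Plotter implementation), and only the exact solution is plotted to keep the example short.

```python
import math

def ascii_line(values, symbols, ymin, ymax, width=60):
    """Return one text line where each value is marked by its symbol."""
    line = [' ']*width
    zero = int(round((0 - ymin)/(ymax - ymin)*(width - 1)))
    if 0 <= zero < width:
        line[zero] = '|'                       # mark the axis u=0
    for y, s in zip(values, symbols):
        col = int(round((y - ymin)/(ymax - ymin)*(width - 1)))
        col = min(max(col, 0), width - 1)      # clip to the plot window
        line[col] = s
    return ''.join(line)

w = 2*math.pi
dt = 0.05
lines = []
for n in range(21):                            # one period of the scaled problem
    t = n*dt
    lines.append(ascii_line([math.cos(w*t)], ['o'], -1.2, 1.2)
                 + ' %.2f' % t)
print('\n'.join(lines))
```

Each printed line corresponds to one time level, so the cosine curve appears rotated with time increasing downwards.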
1.3.5 Empirical Analysis of the Solution


For oscillating functions like those in Fig. 1.2 we may compute the amplitude and
frequency (or period) empirically. That is, we run through the discrete solution
points $(t_n, u^n)$ and find all maxima and minima points. The distance between two
consecutive maxima (or minima) points can be used as an estimate of the local
period, while half the difference between the u value at a maximum and a nearby
minimum gives an estimate of the local amplitude.

The local maxima are the points where

$$u^{n-1} < u^n > u^{n+1},\quad n = 1,\ldots,N_t-1, \tag{1.14}$$

and the local minima are recognized by

$$u^{n-1} > u^n < u^{n+1},\quad n = 1,\ldots,N_t-1. \tag{1.15}$$

In computer code this becomes 


def minmax(t, u):
    minima = []; maxima = []
    for n in range(1, len(u)-1, 1):
        if u[n-1] > u[n] < u[n+1]:
            minima.append((t[n], u[n]))
        if u[n-1] < u[n] > u[n+1]:
            maxima.append((t[n], u[n]))
    return minima, maxima


Note that the two returned objects are lists of tuples. 
Let $(t_i, e_i)$, $i = 0,\ldots,M-1$, be the sequence of all the $M$ maxima points,
where $t_i$ is the time value and $e_i$ the corresponding u value. The local period
can be defined as $p_i = t_{i+1} - t_i$. With Python syntax this reads


def periods(maxima):
    p = [maxima[n][0] - maxima[n-1][0]
         for n in range(1, len(maxima))]
    return np.array(p)


The list p created by a list comprehension is converted to an array since we probably 
want to compute with it, e.g., find the corresponding frequencies 2*pi/p. 

Having the minima and the maxima, the local amplitude can be calculated as half the
difference between the u values at two neighboring minimum and maximum points:


def amplitudes(minima, maxima):
    a = [(abs(maxima[n][1] - minima[n][1]))/2.0
         for n in range(min(len(minima),len(maxima)))]
    return np.array(a)


The code segments are found in the file vib_empirical_analysis.py.

Since a[i] and p[i] correspond to the i-th amplitude estimate and the i-th
period estimate, respectively, it is most convenient to visualize the a and p values
with the index i on the horizontal axis. (There is no unique time point associated
with either of these estimates since values at two different time points were used in
the computations.)
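A quick way to gain confidence in the three functions is to apply them to data where the answer is known. The sketch below repeats the functions to be self-contained and feeds them a sampled cosine; the amplitude 2 and the 40 samples per period are arbitrary choices made for this illustration.

```python
import numpy as np

def minmax(t, u):
    minima = []; maxima = []
    for n in range(1, len(u)-1):
        if u[n-1] > u[n] < u[n+1]:
            minima.append((t[n], u[n]))
        if u[n-1] < u[n] > u[n+1]:
            maxima.append((t[n], u[n]))
    return minima, maxima

def periods(maxima):
    return np.array([maxima[n][0] - maxima[n-1][0]
                     for n in range(1, len(maxima))])

def amplitudes(minima, maxima):
    return np.array([abs(maxima[n][1] - minima[n][1])/2.0
                     for n in range(min(len(minima), len(maxima)))])

# Synthetic data: u = I*cos(w*t) with period 1 and 40 samples per period
I, w = 2.0, 2*np.pi
t = np.linspace(0, 5, 201)
u = I*np.cos(w*t)

minima, maxima = minmax(t, u)
p = periods(maxima)              # should be close to the period 1
a = amplitudes(minima, maxima)   # should be close to I
print('periods:', p)
print('amplitudes:', a)
```

Note that the end points of the mesh are never classified as extrema, since the test needs a neighbor on both sides.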

In the analysis of very long time series, it is advantageous to compute and plot p
and a instead of u to get an impression of the development of the oscillations. Let
us do this for the scaled problem and $\Delta t = 0.1, 0.05, 0.01$. A ready-made function


plot_empirical_freq_and_amplitude(u, t, I, w) 


computes the empirical amplitudes and periods, and creates a plot where the am- 
plitudes and angular frequencies are visualized together with the exact amplitude I 
and the exact angular frequency w. We can make a little program for creating the 
plot: 


from vib_undamped import solver, plot_empirical_freq_and_amplitude
from math import pi
dt_values = [0.1, 0.05, 0.01]
u_cases = []
t_cases = []
for dt in dt_values:
    # Simulate scaled problem for 40 periods
    u, t = solver(I=1, w=2*pi, dt=dt, T=40)
    u_cases.append(u)
    t_cases.append(t)
plot_empirical_freq_and_amplitude(u_cases, t_cases, I=1, w=2*pi)
Fig. 1.3 Empirical angular frequency (left) and amplitude (right) for three different time steps


Figure 1.3 shows the result: we clearly see that lowering $\Delta t$ improves the
angular frequency significantly, while the amplitude seems to be more accurate. The
lines with $\Delta t = 0.01$, corresponding to 100 steps per period, can hardly be
distinguished from the exact values. The next section shows how we can get
mathematical insight into why amplitudes are good while frequencies are more
inaccurate.


1.4 Analysis of the Numerical Scheme 
1.4.1 Deriving a Solution of the Numerical Scheme 


After having seen the phase error grow with time in the previous section, we shall
now quantify this error through mathematical analysis. The key tool in the analysis
will be to establish an exact solution of the discrete equations. The difference
equation (1.7) has constant coefficients and is homogeneous. Such equations are
known to have solutions of the form $u^n = CA^n$, where $A$ is some number to be
determined from the difference equation and $C$ is found from the initial condition
($C = I$). Recall that $n$ in $u^n$ is a superscript labeling the time level, while
$n$ in $A^n$ is an exponent.

With oscillating functions as solutions, the algebra will be considerably
simplified if we seek an $A$ of the form

$$A = e^{i\tilde\omega\Delta t},$$

and solve for the numerical frequency $\tilde\omega$ rather than $A$. Note that
$i = \sqrt{-1}$ is the imaginary unit. (Using a complex exponential function gives
simpler arithmetic than working with a sine or cosine function.) We have

$$A^n = e^{i\tilde\omega\Delta t\,n} = e^{i\tilde\omega t_n}
= \cos(\tilde\omega t_n) + i\sin(\tilde\omega t_n).$$

The physically relevant numerical solution can be taken as the real part of this
complex expression.
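The calculations go as

$$\begin{aligned}
[D_tD_t u]^n &= \frac{u^{n+1} - 2u^n + u^{n-1}}{\Delta t^2}\\
&= I\frac{A^{n+1} - 2A^n + A^{n-1}}{\Delta t^2}\\
&= \frac{I}{\Delta t^2}\left(e^{i\tilde\omega(t_n+\Delta t)} - 2e^{i\tilde\omega t_n}
   + e^{i\tilde\omega(t_n-\Delta t)}\right)\\
&= Ie^{i\tilde\omega t_n}\frac{1}{\Delta t^2}\left(e^{i\tilde\omega\Delta t}
   + e^{i\tilde\omega(-\Delta t)} - 2\right)\\
&= Ie^{i\tilde\omega t_n}\frac{2}{\Delta t^2}\left(\cosh(i\tilde\omega\Delta t) - 1\right)\\
&= Ie^{i\tilde\omega t_n}\frac{2}{\Delta t^2}\left(\cos(\tilde\omega\Delta t) - 1\right)\\
&= -Ie^{i\tilde\omega t_n}\frac{4}{\Delta t^2}\sin^2\left(\frac{\tilde\omega\Delta t}{2}\right).
\end{aligned}$$

The last line follows from the relation $\cos x - 1 = -2\sin^2(x/2)$ (try cos(x)-1 in
wolframalpha.com⁹ to see the formula).

The scheme (1.7) with $u^n = Ie^{i\tilde\omega\Delta t\,n}$ inserted now gives

$$-Ie^{i\tilde\omega t_n}\frac{4}{\Delta t^2}\sin^2\left(\frac{\tilde\omega\Delta t}{2}\right)
+ \omega^2 Ie^{i\tilde\omega t_n} = 0, \tag{1.16}$$

which after dividing by $Ie^{i\tilde\omega t_n}$ results in

$$\frac{4}{\Delta t^2}\sin^2\left(\frac{\tilde\omega\Delta t}{2}\right) = \omega^2. \tag{1.17}$$

The first step in solving for the unknown $\tilde\omega$ is

$$\sin^2\left(\frac{\tilde\omega\Delta t}{2}\right) = \left(\frac{\omega\Delta t}{2}\right)^2.$$

Then, taking the square root, applying the inverse sine function, and multiplying
by $2/\Delta t$, results in

$$\tilde\omega = \pm\frac{2}{\Delta t}\sin^{-1}\left(\frac{\omega\Delta t}{2}\right). \tag{1.18}$$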


1.4.2 The Error in the Numerical Frequency 


The first observation following (1.18) tells that there is a phase error since the
numerical frequency $\tilde\omega$ never equals the exact frequency $\omega$. But
how good is the approximation (1.18)? That is, what is the error
$\omega - \tilde\omega$ or $\tilde\omega/\omega$? Taylor series expansion for small
$\Delta t$ may give an expression that is easier to understand than the
complicated function in (1.18):

9 http://www.wolframalpha.com
>>> from sympy import *
>>> dt, w = symbols('dt w')
>>> w_tilde_e = 2/dt*asin(w*dt/2)
>>> w_tilde_series = w_tilde_e.series(dt, 0, 4)
>>> print w_tilde_series
w + dt**2*w**3/24 + O(dt**4)


This means that

$$\tilde\omega = \omega\left(1 + \frac{1}{24}\omega^2\Delta t^2\right) + O(\Delta t^4). \tag{1.19}$$

The error in the numerical frequency is of second order in $\Delta t$, and the error
vanishes as $\Delta t \to 0$. We see that $\tilde\omega > \omega$ since the term
$\omega^3\Delta t^2/24 > 0$ and this is by far the biggest term in the series
expansion for small $\omega\Delta t$. A numerical frequency that is too large gives
an oscillating curve that oscillates too fast, so its peaks arrive earlier than the
peaks of the exact oscillations, a feature that can be seen in the left plot in
Fig. 1.2.

Figure 1.4 plots the discrete frequency (1.18) and its approximation (1.19) for
$\omega = 1$ (based on the program vib_plot_freq.py). Although $\tilde\omega$ is a
function of $\Delta t$ in (1.19), it is misleading to think of $\Delta t$ as the
important discretization parameter. It is the product $\omega\Delta t$ that is the
key discretization parameter. This quantity reflects the number of time steps per
period of the oscillations. To see this, we set $P = N_P\Delta t$, where $P$ is the
length of a period, and $N_P$ is the number of time steps during a period. Since
$P$ and $\omega$ are related by $P = 2\pi/\omega$, we get that
$\omega\Delta t = 2\pi/N_P$, which shows that $\omega\Delta t$ is directly related
to $N_P$.

The plot shows that at least $N_P \sim 25-30$ points per period are necessary for
reasonable accuracy, but this depends on the length of the simulation ($T$) as the
total phase error due to the frequency error grows linearly with time (see
Exercise 1.2).
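The dependence of the frequency error on the number of time steps per period can be tabulated directly from (1.18) and (1.19). The function name frequency_ratios is a hypothetical choice for this sketch, which works with the dimensionless product of w and dt only.

```python
import numpy as np

def frequency_ratios(N_P):
    """Return (exact, series) values of w_tilde/w for N_P steps per period."""
    p = 2*np.pi/N_P               # p = w*dt
    exact = 2/p*np.arcsin(p/2)    # (1.18) divided by w
    series = 1 + p**2/24          # (1.19) divided by w, O(dt**4) term dropped
    return exact, series

for N_P in [5, 10, 20, 30, 50, 100]:
    exact, series = frequency_ratios(N_P)
    print('N_P=%3d  exact=%.5f  series=%.5f' % (N_P, exact, series))
```

The relative frequency error is the deviation of these ratios from 1, and multiplying it by the number of simulated periods gives a rough measure of the accumulated phase error in units of periods.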
Fig. 1.4 Exact discrete frequency and its second-order series expansion


1.4.3 Empirical Convergence Rates and Adjusted $\omega$


The expression (1.19) suggests that adjusting $\omega$ to

$$\omega\left(1 - \frac{1}{24}\omega^2\Delta t^2\right),$$

could have effect on the convergence rate of the global error in $u$ (cf. Sect. 1.2.2).
With the convergence_rates function in vib_undamped.py we can easily
check this. A special solver, with adjusted $\omega$, is available as the function
solver_adjust_w. A call to convergence_rates with this solver reveals that
the rate is 4.0! With the original, physical $\omega$ the rate is 2.0, as expected
from using second-order finite difference approximations, as expected from the
forthcoming derivation of the global error, and as expected from truncation error
analysis as explained in Appendix B.4.1.

Adjusting $\omega$ is an ideal trick for this simple problem, but when adding damping
and nonlinear terms, we have no simple formula for the impact on $\omega$, and
therefore we cannot use the trick.
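The experiment can be reproduced with a small self-contained program. The functions below are simplified stand-ins for solver, solver_adjust_w, and convergence_rates in vib_undamped.py (an assumption: they follow the description in the text, starting with 30 time steps per period and halving the time step, but are not copies of the original code).

```python
import numpy as np

def solver(I, w, dt, T, adjust_w=False):
    """Centered scheme for u'' + w**2*u = 0; optionally with adjusted w."""
    dt = float(dt)
    Nt = int(round(T/dt))
    u = np.zeros(Nt+1)
    t = np.linspace(0, Nt*dt, Nt+1)
    # Frequency used in the scheme: physical w or w*(1 - w**2*dt**2/24)
    w_num = w*(1 - w**2*dt**2/24.0) if adjust_w else w
    u[0] = I
    u[1] = u[0] - 0.5*dt**2*w_num**2*u[0]
    for n in range(1, Nt):
        u[n+1] = 2*u[n] - u[n-1] - dt**2*w_num**2*u[n]
    return u, t

def convergence_rates(m, adjust_w, num_periods=8):
    """Return rates estimated from m experiments with halved time steps."""
    I, w = 0.9, 2*np.pi
    P = 2*np.pi/w
    dt = P/30
    dt_values, E_values = [], []
    for i in range(m):
        u, t = solver(I, w, dt, num_periods*P, adjust_w)
        e = I*np.cos(w*t) - u                 # error against physical w
        E_values.append(np.sqrt(dt*np.sum(e**2)))
        dt_values.append(dt)
        dt = dt/2
    return [np.log(E_values[i-1]/E_values[i])/
            np.log(dt_values[i-1]/dt_values[i]) for i in range(1, m)]

r2 = convergence_rates(m=5, adjust_w=False)
r4 = convergence_rates(m=5, adjust_w=True)
print('original w rates:', r2)
print('adjusted w rates:', r4)
```

The error is always measured against the exact solution with the physical frequency; only the frequency fed to the scheme is adjusted.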


1.4.4 Exact Discrete Solution 


Perhaps more important than the $\tilde\omega = \omega + O(\Delta t^2)$ result found
above is the fact that we have an exact discrete solution of the problem:

$$u^n = I\cos(\tilde\omega n\Delta t),\quad
\tilde\omega = \frac{2}{\Delta t}\sin^{-1}\left(\frac{\omega\Delta t}{2}\right). \tag{1.20}$$

We can then compute the error mesh function

$$e^n = u_e(t_n) - u^n = I\cos(\omega n\Delta t) - I\cos(\tilde\omega n\Delta t). \tag{1.21}$$

From the formula $\cos 2x - \cos 2y = -2\sin(x-y)\sin(x+y)$ we can rewrite $e^n$
so the expression is easier to interpret:

$$e^n = -2I\sin\left(t_n\frac{1}{2}(\omega - \tilde\omega)\right)
\sin\left(t_n\frac{1}{2}(\omega + \tilde\omega)\right). \tag{1.22}$$

The error mesh function is ideal for verification purposes and you are strongly
encouraged to make a test based on (1.20) by doing Exercise 1.11.
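A minimal sketch of such a verification is shown below. It assumes the scheme (1.7) is started with the special first step $u^1 = u^0 - \frac{1}{2}\Delta t^2\omega^2u^0$; the solver is a stand-in written for this example, and the parameter values are arbitrary.

```python
import numpy as np

def solver(I, w, dt, T):
    """Centered scheme for u'' + w**2*u = 0, u(0)=I, u'(0)=0 (sketch)."""
    dt = float(dt)
    Nt = int(round(T/dt))
    u = np.zeros(Nt+1)
    t = np.linspace(0, Nt*dt, Nt+1)
    u[0] = I
    u[1] = u[0] - 0.5*dt**2*w**2*u[0]
    for n in range(1, Nt):
        u[n+1] = 2*u[n] - u[n-1] - dt**2*w**2*u[n]
    return u, t

def u_discrete_exact(n, I, w, dt):
    """Exact solution (1.20) of the discrete equations at time level n."""
    w_tilde = 2/dt*np.arcsin(w*dt/2)
    return I*np.cos(w_tilde*n*dt)

I, w, dt, T = 1.2, 2*np.pi, 0.1, 6
u, t = solver(I, w, dt, T)
n = np.arange(len(u))
diff = np.abs(u - u_discrete_exact(n, I, w, dt)).max()
print('max deviation from (1.20): %g' % diff)
```

The deviation should be at the level of rounding errors even though the time step is coarse, since (1.20) solves the discrete equations exactly.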


1.4.5 Convergence 


We can use (1.19) and (1.21), or (1.22), to show convergence of the numerical
scheme, i.e., $e^n \to 0$ as $\Delta t \to 0$, which implies that the numerical
solution approaches the exact solution as $\Delta t$ approaches zero. We have that

$$\lim_{\Delta t\to 0}\tilde\omega = \lim_{\Delta t\to 0}\frac{2}{\Delta t}
\sin^{-1}\left(\frac{\omega\Delta t}{2}\right) = \omega,$$
by L'Hôpital's rule. This result could also have been computed with WolframAlpha¹⁰,
or we could use the limit functionality in sympy:


>>> import sympy as sym
>>> dt, w = sym.symbols('x w')
>>> sym.limit((2/dt)*sym.asin(w*dt/2), dt, 0, dir='+')
w


Also (1.19) can be used to establish that $\tilde\omega \to \omega$ when
$\Delta t \to 0$. It then follows from the expression(s) for $e^n$ that $e^n \to 0$.


1.4.6 The Global Error 


To achieve more analytical insight into the nature of the global error, we can Taylor
expand the error mesh function (1.21). Since $\tilde\omega$ in (1.18) contains
$\Delta t$ in the denominator we use the series expansion for $\tilde\omega$ inside
the cosine function. A relevant sympy session is


>>> from sympy import *
>>> dt, w, t = symbols('dt w t')
>>> w_tilde_e = 2/dt*asin(w*dt/2)
>>> w_tilde_series = w_tilde_e.series(dt, 0, 4)
>>> w_tilde_series
w + dt**2*w**3/24 + O(dt**4)


Series expansions in sympy have the inconvenient O() term that prevents further
calculations with the series. We can use the removeO() command to get rid of the
O() term:


>>> w_tilde_series = w_tilde_series.removeO()
>>> w_tilde_series
dt**2*w**3/24 + w


Using this w_tilde_series expression for $\tilde\omega$ in (1.21), dropping $I$
(which is a common factor), and performing a series expansion of the error yields


>>> error = cos(w*t) - cos(w_tilde_series*t)
>>> error.series(dt, 0, 6)
dt**2*t*w**3*sin(t*w)/24 + dt**4*t**2*w**6*cos(t*w)/1152 + O(dt**6)


Since we are mainly interested in the leading-order term in such expansions (the
term with lowest power in $\Delta t$, which goes most slowly to zero), we use the
.as_leading_term(dt) construction to pick out this term:


10 http://www.wolframalpha.com/input/?i=%282%2Fx%29*asin%28w*x%2F2%29+as+x-%3E0
>>> error.series(dt, 0, 6).as_leading_term(dt)
dt**2*t*w**3*sin(t*w)/24


The last result means that the leading order global (true) error at a point $t$ is
proportional to $\omega^3 t\Delta t^2$. Considering only the discrete $t_n$ values
for $t$, $t_n$ is related to $\Delta t$ through $t_n = n\Delta t$. The factor
$\sin(\omega t)$ can at most be 1, so we use this value to bound the leading-order
expression to its maximum value

$$e^n = \frac{1}{24}n\omega^3\Delta t^3.$$

This is the dominating term of the error at a point.

We are interested in the accumulated global error, which can be taken as the
$\ell^2$ norm of $e^n$. The norm is simply computed by summing contributions from
all mesh points:

$$||e^n||_{\ell^2}^2 = \Delta t\sum_{n=0}^{N_t}\frac{1}{24^2}n^2\omega^6\Delta t^6
= \frac{1}{24^2}\omega^6\Delta t^7\sum_{n=0}^{N_t}n^2.$$

The sum $\sum_{n=0}^{N_t}n^2$ is approximately equal to $\frac{1}{3}N_t^3$.
Replacing $N_t$ by $T/\Delta t$ and taking the square root gives the expression

$$||e^n||_{\ell^2} = \frac{1}{24}\sqrt{\frac{T^3}{3}}\omega^3\Delta t^2.$$

This is our expression for the global (or integrated) error. A primary result from
this expression is that the global error is proportional to $\Delta t^2$.
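The estimate can be compared with the error norm measured in actual simulations. The sketch below uses a stand-in solver written for this example and arbitrary parameter values. Since the estimate replaces $\sin(\omega t)$ by its maximum value 1, the measured norm is expected to come out somewhat smaller than the estimate, while both should scale as $\Delta t^2$.

```python
import numpy as np

def solver(I, w, dt, T):
    """Centered scheme for u'' + w**2*u = 0, u(0)=I, u'(0)=0 (sketch)."""
    dt = float(dt)
    Nt = int(round(T/dt))
    u = np.zeros(Nt+1)
    t = np.linspace(0, Nt*dt, Nt+1)
    u[0] = I
    u[1] = u[0] - 0.5*dt**2*w**2*u[0]
    for n in range(1, Nt):
        u[n+1] = 2*u[n] - u[n-1] - dt**2*w**2*u[n]
    return u, t

def l2_error(I, w, dt, T):
    """Measured discrete l2 norm of the error mesh function."""
    u, t = solver(I, w, dt, T)
    e = I*np.cos(w*t) - u
    return np.sqrt(dt*np.sum(e**2))

def l2_estimate(w, dt, T):
    """Leading-order estimate of the l2 error norm (with I=1)."""
    return 1.0/24*np.sqrt(T**3/3.0)*w**3*dt**2

w, T = 2*np.pi, 4
E = {}
for dt in [0.02, 0.01, 0.005]:
    E[dt] = l2_error(1, w, dt, T)
    print('dt=%.3f  measured=%.3e  estimate=%.3e'
          % (dt, E[dt], l2_estimate(w, dt, T)))
```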


1.4.7 Stability 


Looking at (1.20), it appears that the numerical solution has constant and correct
amplitude, but an error in the angular frequency. A constant amplitude is not
necessarily the case, however! To see this, note that if only $\Delta t$ is large
enough, the magnitude of the argument to $\sin^{-1}$ in (1.18) may be larger than 1,
i.e., $\omega\Delta t/2 > 1$. In this case, $\sin^{-1}(\omega\Delta t/2)$ has a
complex value and therefore $\tilde\omega$ becomes complex. Type, for example,
asin(x) in wolframalpha.com¹¹ to see basic properties of $\sin^{-1}(x)$.

A complex $\tilde\omega$ can be written $\tilde\omega = \tilde\omega_r + i\tilde\omega_i$.
Since $\sin^{-1}(x)$ has a negative imaginary part for $x > 1$, $\tilde\omega_i < 0$,
which means that $e^{i\tilde\omega t} = e^{-\tilde\omega_i t}e^{i\tilde\omega_r t}$
will lead to exponential growth in time because $e^{-\tilde\omega_i t}$ with
$\tilde\omega_i < 0$ has a positive exponent.


Stability criterion
We do not tolerate growth in the amplitude since such growth is not present in
the exact solution. Therefore, we must impose a stability criterion so that the
argument in the inverse sine function leads to real and not complex values of
$\tilde\omega$. The stability criterion reads

$$\frac{\omega\Delta t}{2} \le 1 \quad\Rightarrow\quad \Delta t \le \frac{2}{\omega}. \tag{1.23}$$

With $\omega = 2\pi$, $\Delta t > \pi^{-1} = 0.3183098861837907$ will give growing
solutions. Figure 1.5 displays what happens when $\Delta t = 0.3184$, which is
slightly above the critical value: $\Delta t = \pi^{-1} + 9.01\cdot 10^{-5}$.


Fig. 1.5 Growing, unstable solution because of a time step slightly beyond the stability limit (dt=0.3184)


11 http://www.wolframalpha.com
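The sharpness of the stability limit can be explored numerically. The following sketch uses a stand-in solver written for this example and compares the largest amplitude for a time step just below and just above $\pi^{-1}$; the simulation length T=30 is an arbitrary choice.

```python
import numpy as np

def solver(I, w, dt, T):
    """Centered scheme for u'' + w**2*u = 0, u(0)=I, u'(0)=0 (sketch)."""
    dt = float(dt)
    Nt = int(round(T/dt))
    u = np.zeros(Nt+1)
    t = np.linspace(0, Nt*dt, Nt+1)
    u[0] = I
    u[1] = u[0] - 0.5*dt**2*w**2*u[0]
    for n in range(1, Nt):
        u[n+1] = 2*u[n] - u[n-1] - dt**2*w**2*u[n]
    return u, t

w = 2*np.pi
u_stable, _ = solver(I=1, w=w, dt=0.3183, T=30)     # just below 1/pi
u_unstable, _ = solver(I=1, w=w, dt=0.3184, T=30)   # just above 1/pi
max_stable = np.abs(u_stable).max()
max_unstable = np.abs(u_unstable).max()
print('max |u| for dt=0.3183: %g' % max_stable)
print('max |u| for dt=0.3184: %g' % max_unstable)
```

Below the limit the amplitude stays at the initial value, while above the limit it grows with every time step, so a longer simulation gives a larger maximum.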


1.4.8 About the Accuracy at the Stability Limit 


An interesting question is whether the stability condition $\Delta t \le 2/\omega$ is
unfortunate, or more precisely: would it be meaningful to take larger time steps to
speed up computations? The answer is a clear no. At the stability limit, we have that
$\sin^{-1}(\omega\Delta t/2) = \sin^{-1}1 = \pi/2$, and therefore
$\tilde\omega = \pi/\Delta t$. (Note that the approximate formula (1.19) is very
inaccurate for this value of $\Delta t$ as it predicts
$\tilde\omega \approx 2.33/\Delta t$, which is a 25 percent reduction.) The
corresponding period of the numerical solution is
$\tilde P = 2\pi/\tilde\omega = 2\Delta t$, which means that there is just one time
step $\Delta t$ between a peak (maximum) and a trough¹² (minimum) in the numerical
solution. This is the shortest possible wave that can be represented in the mesh! In
other words, it is not meaningful to use a larger time step than the stability limit.

Also, the error in angular frequency when $\Delta t = 2/\omega$ is severe: Figure 1.6
shows a comparison of the numerical and analytical solution with $\omega = 2\pi$ and
$\Delta t = 2/\omega = \pi^{-1}$. Already after one period, the numerical solution
has a trough while the exact solution has a peak (!). The error in frequency when
$\Delta t$ is at the stability limit becomes
$\omega - \tilde\omega = \omega(1 - \pi/2) \approx -0.57\omega$. The corresponding
error in the period is $P - \tilde P \approx 0.36P$. The error after $m$ periods is
then $0.36mP$. This error has reached half a period when
$m = 1/(2\cdot 0.36) \approx 1.38$, which theoretically confirms the observations in
Fig. 1.6 that the numerical solution is a trough ahead of a peak already after one
and a half period. Consequently, $\Delta t$ should be chosen much less than the
stability limit to achieve meaningful numerical computations.


Fig. 1.6 Numerical solution with $\Delta t$ exactly at the stability limit


12 https://simple.wikipedia.org/wiki/Wave_(physics)
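The behavior at the stability limit is easy to reproduce. In the sketch below the values w=2 and dt=1 are chosen (as an assumption made for this illustration, different from the values in Fig. 1.6) so that the product of w and dt equals 2 exactly in floating-point arithmetic; the solver is a stand-in written for this example.

```python
import numpy as np

def solver(I, w, dt, T):
    """Centered scheme for u'' + w**2*u = 0, u(0)=I, u'(0)=0 (sketch)."""
    dt = float(dt)
    Nt = int(round(T/dt))
    u = np.zeros(Nt+1)
    t = np.linspace(0, Nt*dt, Nt+1)
    u[0] = I
    u[1] = u[0] - 0.5*dt**2*w**2*u[0]
    for n in range(1, Nt):
        u[n+1] = 2*u[n] - u[n-1] - dt**2*w**2*u[n]
    return u, t

w = 2.0
dt = 2/w                          # exactly at the stability limit
u, t = solver(I=1, w=w, dt=dt, T=10)
print(u)                          # alternates between peak and trough

P = 2*np.pi/w                     # exact period
P_tilde = 2*dt                    # period of the numerical solution
rel_period_error = (P - P_tilde)/P
print('relative error in the period: %.2f' % rel_period_error)
```

The numerical solution jumps between a peak and a trough in every time step, which is the shortest wave the mesh can represent.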


Summary
From the accuracy and stability analysis we can draw three important conclusions:


1. The key parameter in the formulas is $p = \omega\Delta t$. The period of
   oscillations is $P = 2\pi/\omega$, and the number of time steps per period is
   $N_P = P/\Delta t$. Therefore, $p = \omega\Delta t = 2\pi/N_P$, showing that the
   critical parameter is the number of time steps per period. The smallest possible
   $N_P$ is 2, showing that $p \in (0, \pi]$.

2. Provided $p \le 2$, the amplitude of the numerical solution is constant.

3. The ratio of the numerical angular frequency and the exact one is
   $\tilde\omega/\omega \approx 1 + \frac{1}{24}p^2$. The error $\frac{1}{24}p^2$
   leads to wrongly displaced peaks of the numerical solution, and the error in peak
   location grows linearly with time (see Exercise 1.2).
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1.5 Alternative Schemes Based on 1st-Order Equations 

A standard technique for solving second-order ODEs is to rewrite them as a system
of first-order ODEs and then choose a solution strategy from the vast collection of
methods for first-order ODE systems. Given the second-order ODE problem

$$u'' + \omega^2 u = 0,\quad u(0) = I,\quad u'(0) = 0,$$

we introduce the auxiliary variable $v = u'$ and express the ODE problem in terms
of first-order derivatives of $u$ and $v$:

$$u' = v, \tag{1.24}$$
$$v' = -\omega^2 u. \tag{1.25}$$

The initial conditions become $u(0) = I$ and $v(0) = 0$.


1.5.1 The Forward Euler Scheme 


A Forward Euler approximation to our $2\times 2$ system of ODEs (1.24)-(1.25) becomes

$$[D_t^+ u = v]^n, \tag{1.26}$$
$$[D_t^+ v = -\omega^2 u]^n, \tag{1.27}$$

or written out,

$$u^{n+1} = u^n + \Delta t\, v^n, \tag{1.28}$$
$$v^{n+1} = v^n - \Delta t\,\omega^2 u^n. \tag{1.29}$$


Let us briefly compare this Forward Euler method with the centered difference
scheme for the second-order differential equation. We have from (1.28) and (1.29)
applied at levels $n$ and $n-1$ that

$$u^{n+1} = u^n + \Delta t\, v^n = u^n + \Delta t\,(v^{n-1} - \Delta t\,\omega^2 u^{n-1}).$$

Since from (1.28)

$$v^{n-1} = \frac{1}{\Delta t}(u^n - u^{n-1}),$$

it follows that

$$u^{n+1} = 2u^n - u^{n-1} - \Delta t^2\omega^2 u^{n-1},$$

which is very close to the centered difference scheme, but the last term is evaluated
at $t_{n-1}$ instead of $t_n$. Rewriting, so that $\Delta t^2\omega^2 u^{n-1}$ appears alone on the right-hand
side, and then dividing by $\Delta t^2$, the new left-hand side is an approximation to $u''$ at
$t_n$, while the right-hand side is sampled at $t_{n-1}$. All terms should be sampled at the
same mesh point, so using $\omega^2 u^{n-1}$ instead of $\omega^2 u^n$ points to a kind of mathematical
error in the derivation of the scheme. This error turns out to be rather crucial for the
accuracy of the Forward Euler method applied to vibration problems (Sect. 1.5.4
has examples).

The reasoning above does not imply that the Forward Euler scheme is not correct, 
but more that it is almost equivalent to a second-order accurate scheme for the 
second-order ODE formulation, and that the error committed has to do with a wrong 
sampling point. 
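The three-level relation derived above can be checked numerically. The sketch below runs the Forward Euler updates (1.28)-(1.29) and measures the residual of $u^{n+1} = 2u^n - u^{n-1} - \Delta t^2\omega^2 u^{n-1}$; the function name `forward_euler` and the parameter values are chosen here for illustration only.

```python
import numpy as np

def forward_euler(I, w, dt, Nt):
    """Advance u' = v, v' = -w**2*u with the updates (1.28)-(1.29)."""
    u = np.zeros(Nt + 1)
    v = np.zeros(Nt + 1)
    u[0], v[0] = I, 0.0
    for n in range(Nt):
        u[n + 1] = u[n] + dt * v[n]
        v[n + 1] = v[n] - dt * w**2 * u[n]
    return u, v

I, w, dt, Nt = 1.0, 2 * np.pi, 0.01, 200
u, v = forward_euler(I, w, dt, Nt)

# Residual of u^{n+1} = 2u^n - u^{n-1} - dt^2 w^2 u^{n-1} for n = 1, ..., Nt-1
residual = u[2:] - (2 * u[1:-1] - u[:-2] - dt**2 * w**2 * u[:-2])
max_residual = np.abs(residual).max()
print("max residual:", max_residual)
```

The residual should be at the level of round-off, which supports the claim that the only deviation from the centered scheme is the sampling point of the last term.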


1.5.2 The Backward Euler Scheme 


A Backward Euler approximation to the ODE system is equally easy to write up in
the operator notation:

$$[D_t^- u = v]^{n+1}, \tag{1.30}$$
$$[D_t^- v = -\omega^2 u]^{n+1}. \tag{1.31}$$

This becomes a coupled system for $u^{n+1}$ and $v^{n+1}$:

$$u^{n+1} - \Delta t\, v^{n+1} = u^n, \tag{1.32}$$
$$v^{n+1} + \Delta t\,\omega^2 u^{n+1} = v^n. \tag{1.33}$$


We can compare (1.32)-(1.33) with the centered scheme (1.7) for the second-order
differential equation. To this end, we eliminate $v^{n+1}$ in (1.32) using (1.33)
solved with respect to $v^{n+1}$. Thereafter, we eliminate $v^n$ using (1.32) solved with
respect to $v^{n+1}$ and also replacing $n+1$ by $n$ and $n$ by $n-1$. The resulting equation
involving only $u^{n+1}$, $u^n$, and $u^{n-1}$ can be ordered as

$$\frac{u^{n+1} - 2u^n + u^{n-1}}{\Delta t^2} = -\omega^2 u^{n+1},$$

which has almost the same form as the centered scheme for the second-order
differential equation, but the right-hand side is evaluated at $u^{n+1}$ and not $u^n$. This
inconsistent sampling of terms has a dramatic effect on the numerical solution, as
we demonstrate in Sect. 1.5.4.
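As an illustration of the coupled system (1.32)-(1.33), the sketch below solves the $2\times 2$ linear system at every time level with `numpy.linalg.solve` and then checks the eliminated three-level equation. The function name `backward_euler` and the test parameters are assumptions made for this example.

```python
import numpy as np

def backward_euler(I, w, dt, Nt):
    """Solve the 2x2 system (1.32)-(1.33) at every time level."""
    # Unknown vector is [u^{n+1}, v^{n+1}]
    A = np.array([[1.0, -dt],
                  [dt * w**2, 1.0]])
    u = np.zeros(Nt + 1)
    v = np.zeros(Nt + 1)
    u[0], v[0] = I, 0.0
    for n in range(Nt):
        u[n + 1], v[n + 1] = np.linalg.solve(A, np.array([u[n], v[n]]))
    return u, v

I, w, dt, Nt = 1.0, 2 * np.pi, 0.01, 200
u, v = backward_euler(I, w, dt, Nt)

# Residual of (u^{n+1} - 2u^n + u^{n-1})/dt^2 = -w^2 u^{n+1}
lhs = (u[2:] - 2 * u[1:-1] + u[:-2]) / dt**2
rhs = -w**2 * u[2:]
max_residual = np.abs(lhs - rhs).max()
print("max residual:", max_residual)
```

Since the coefficient matrix does not change with $n$, a production code would factorize it once (or invert the $2\times 2$ matrix by hand); solving from scratch in each step is chosen here only for readability.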


1.5.3 The Crank-Nicolson Scheme 


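The Crank-Nicolson scheme takes this form in the operator notation:

$$[D_t u = \overline{v}^{t}]^{n+\frac{1}{2}}, \tag{1.34}$$
$$[D_t v = -\omega^2\, \overline{u}^{t}]^{n+\frac{1}{2}}. \tag{1.35}$$

Writing the equations out and rearranging terms shows that this is also a coupled
system of two linear equations at each time level:

$$u^{n+1} - \frac{1}{2}\Delta t\, v^{n+1} = u^n + \frac{1}{2}\Delta t\, v^n, \tag{1.36}$$
$$v^{n+1} + \frac{1}{2}\Delta t\,\omega^2 u^{n+1} = v^n - \frac{1}{2}\Delta t\,\omega^2 u^n. \tag{1.37}$$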
We may compare also this scheme to the centered discretization of the second-order
ODE. It turns out that the Crank-Nicolson scheme is equivalent to the discretization

$$\frac{u^{n+1} - 2u^n + u^{n-1}}{\Delta t^2} = -\omega^2\,\frac{1}{4}(u^{n+1} + 2u^n + u^{n-1}) = -\omega^2 u^n + O(\Delta t^2). \tag{1.38}$$

That is, the Crank-Nicolson scheme is equivalent to (1.7) for the second-order ODE, apart
from an extra term of size $\Delta t^2$, but this is an error of the same order as in the finite
difference approximation on the left-hand side of the equation anyway. The fact
that the Crank-Nicolson scheme is so close to (1.7) makes it a much better method
than the Forward or Backward Euler methods for vibration problems, as will be
illustrated in Sect. 1.5.4.

Deriving (1.38) is a bit tricky. We start with rewriting the Crank-Nicolson equations
as follows

$$u^{n+1} - u^n = \frac{1}{2}\Delta t\,(v^{n+1} + v^n), \tag{1.39}$$
$$v^{n+1} = v^n - \frac{1}{2}\Delta t\,\omega^2(u^{n+1} + u^n), \tag{1.40}$$

and add the latter at the previous time level as well:

$$v^n = v^{n-1} - \frac{1}{2}\Delta t\,\omega^2(u^n + u^{n-1}). \tag{1.41}$$

We can also rewrite (1.39) at the previous time level as

$$v^n + v^{n-1} = \frac{2}{\Delta t}(u^n - u^{n-1}). \tag{1.42}$$

Inserting (1.40) for $v^{n+1}$ in (1.39) and (1.41) for $v^n$ in (1.39) yields after some
reordering:

$$u^{n+1} - u^n = \frac{1}{2}\Delta t\left(-\frac{1}{2}\Delta t\,\omega^2(u^{n+1} + 2u^n + u^{n-1}) + v^n + v^{n-1}\right).$$

Now, $v^n + v^{n-1}$ can be eliminated by means of (1.42). The result becomes

$$u^{n+1} - 2u^n + u^{n-1} = -\Delta t^2\omega^2\,\frac{1}{4}(u^{n+1} + 2u^n + u^{n-1}). \tag{1.43}$$

It can be shown that

$$\frac{1}{4}(u^{n+1} + 2u^n + u^{n-1}) \approx u^n + O(\Delta t^2),$$

meaning that (1.43) is an approximation to the centered scheme (1.7) for the second-order
ODE where the sampling error in the term $\Delta t^2\omega^2 u^n$ is of the same order as
the approximation errors in the finite differences, i.e., $O(\Delta t^2)$. The Crank-Nicolson
scheme written as (1.43) therefore has consistent sampling of all terms at the same
time point $t_n$.
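The equivalence between the Crank-Nicolson equations (1.36)-(1.37) and the three-level form (1.43) can also be verified numerically. The following sketch, with a hypothetical helper `crank_nicolson` and arbitrarily chosen parameters, solves the coupled system in each step and evaluates the residual of (1.43).

```python
import numpy as np

def crank_nicolson(I, w, dt, Nt):
    """Solve the coupled equations (1.36)-(1.37) at every time level."""
    a = 0.5 * dt
    A = np.array([[1.0, -a],
                  [a * w**2, 1.0]])
    u = np.zeros(Nt + 1)
    v = np.zeros(Nt + 1)
    u[0], v[0] = I, 0.0
    for n in range(Nt):
        b = np.array([u[n] + a * v[n], v[n] - a * w**2 * u[n]])
        u[n + 1], v[n + 1] = np.linalg.solve(A, b)
    return u, v

I, w, dt, Nt = 1.0, 2 * np.pi, 0.01, 200
u, v = crank_nicolson(I, w, dt, Nt)

# Residual of (1.43)
lhs = u[2:] - 2 * u[1:-1] + u[:-2]
rhs = -0.25 * dt**2 * w**2 * (u[2:] + 2 * u[1:-1] + u[:-2])
max_residual = np.abs(lhs - rhs).max()
print("max residual of (1.43):", max_residual)
```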
1.5.4 Comparison of Schemes


We can easily compare methods like the ones above (and many more!) with the aid
of the Odespy¹³ package. Below is a sketch of the code.

import odespy
import numpy as np

def f(u, t, w=1):
    # v, u numbering for EulerCromer to work well
    v, u = u  # u is array of length 2 holding our [v, u]
    return [-w**2*u, v]

def run_solvers_and_plot(solvers, timesteps_per_period=20,
                         num_periods=1, I=1, w=2*np.pi):
    P = 2*np.pi/w  # duration of one period
    dt = P/timesteps_per_period
    Nt = num_periods*timesteps_per_period
    T = Nt*dt
    t_mesh = np.linspace(0, T, Nt+1)

    legends = []
    for solver in solvers:
        solver.set(f_kwargs={'w': w})
        solver.set_initial_condition([0, I])
        u, t = solver.solve(t_mesh)


There is quite some more code dealing with plots also, and we refer to the source 
file vib_undamped_odespy.py for details. Observe that keyword arguments in 
f(u,t,w=1) can be supplied through a solver parameter f_kwargs (dictionary of 
additional keyword arguments to f). 

Specification of the Forward Euler, Backward Euler, and Crank-Nicolson 
schemes is done like this: 


solvers = [
    odespy.ForwardEuler(f),
    # Implicit methods must use Newton solver to converge
    odespy.BackwardEuler(f, nonlinear_solver='Newton'),
    odespy.CrankNicolson(f, nonlinear_solver='Newton'),
    ]

The vib_undamped_odespy.py program makes two plots of the computed solutions
with the various methods in the solvers list: one plot with $u(t)$ versus $t$,
and one phase plane plot where $v$ is plotted against $u$. That is, the phase plane
plot is the curve $(u(t), v(t))$ parameterized by $t$. Analytically, $u = I\cos(\omega t)$ and
$v = u' = -\omega I\sin(\omega t)$. The exact curve $(u(t), v(t))$ is therefore an ellipse, which
often looks like a circle in a plot if the axes are automatically scaled. The important
feature, however, is that the exact curve $(u(t), v(t))$ is closed and repeats itself for
every period. Not all numerical schemes are capable of doing that, meaning that the
amplitude instead shrinks or grows with time.


13 https://github.com/hplgit/odespy 
Fig. 1.7 Comparison of classical schemes in the phase plane for two time step values
Fig. 1.8 Comparison of solution curves for classical schemes


Figure 1.7 shows the results. Note that Odespy applies the label MidpointImplicit
for what we have specified as CrankNicolson in the code (CrankNicolson is
just a synonym for class MidpointImplicit in the Odespy code). The Forward
Euler scheme in Fig. 1.7 has a pronounced spiral curve, pointing to the fact that the
amplitude steadily grows, which is also evident in Fig. 1.8. The Backward Euler
scheme has a similar feature, except that the spiral goes inward and the amplitude
is significantly damped. The changing amplitude and the spiral form decrease with
decreasing time step. The Crank-Nicolson scheme looks much more accurate. In
fact, these plots tell that the Forward and Backward Euler schemes are not suitable
for solving our ODEs with oscillating solutions.


1.5.5 Runge-Kutta Methods 


We may run two other popular standard methods for first-order ODEs, the 2nd- and
4th-order Runge-Kutta methods, to see how they perform. Figures 1.9 and 1.10
show the solutions with larger $\Delta t$ values than what was used in the previous two
plots.




[Figure 1.9: two phase-plane panels titled "Time step: 0.1" and "Time step: 0.05", horizontal axis u(t), with curves for RK2, RK4, and the exact solution]


Fig. 1.9 Comparison of Runge-Kutta schemes in the phase plane 
Fig. 1.10 Comparison of Runge-Kutta schemes


The visual impression is that the 4th-order Runge-Kutta method is very accurate, 
under all circumstances in these tests, while the 2nd-order scheme suffers from 
amplitude errors unless the time step is very small. 

The corresponding results for the Crank-Nicolson scheme are shown in Fig. 1.11.
It is clear that the Crank-Nicolson scheme outperforms the 2nd-order Runge-Kutta
method. Both schemes have the same order of accuracy $O(\Delta t^2)$, but the difference
in the accuracy that matters in a real physical application is very pronounced
in this example. Exercise 1.13 invites you to investigate how the amplitude
is computed by a series of famous methods for first-order ODEs.


1.5.6 Analysis of the Forward Euler Scheme 


We may try to find exact solutions of the discrete equations (1.28)-(1.29) in the
Forward Euler method to better understand why this otherwise useful method
performs so badly for vibration ODEs. An "ansatz" for the solution of the discrete
equations is

$$u^n = IA^n,$$
$$v^n = qIA^n,$$

where $q$ and $A$ are scalars to be determined. We could have used a complex
exponential form $e^{i\tilde\omega n\Delta t}$ since we get oscillatory solutions, but the oscillations grow in
the Forward Euler method, so the numerical frequency $\tilde\omega$ will be complex anyway
(producing an exponentially growing amplitude). Therefore, it is easier to just work
with potentially complex $A$ and $q$ as introduced above.

Fig. 1.11 Long-time behavior of the Crank-Nicolson scheme in the phase plane

The Forward Euler scheme leads to

$$A = 1 + \Delta t\, q,$$
$$A = 1 - \Delta t\,\omega^2 q^{-1}.$$

We can easily eliminate $A$, get $q^2 + \omega^2 = 0$, and solve for

$$q = \pm i\omega,$$

which gives

$$A = 1 \pm \Delta t\, i\omega.$$


We shall take the real part of $A^n$ as the solution. The two values of $A$ are complex
conjugates, and the real part of $A^n$ will be the same for both roots. This is easy to
realize if we rewrite the complex numbers in polar form, which is also convenient
for further analysis and understanding. The polar form $re^{i\theta}$ of a complex number
$x + iy$ has $r = \sqrt{x^2 + y^2}$ and $\theta = \tan^{-1}(y/x)$. Hence, the polar form of the two
values for $A$ becomes

$$1 \pm \Delta t\, i\omega = \sqrt{1 + \omega^2\Delta t^2}\; e^{\pm i\tan^{-1}(\omega\Delta t)}.$$

Now it is very easy to compute $A^n$:

$$(1 \pm \Delta t\, i\omega)^n = (1 + \omega^2\Delta t^2)^{n/2}\, e^{\pm ni\tan^{-1}(\omega\Delta t)}.$$

Since $\cos(\theta n) = \cos(-\theta n)$, the real parts of the two numbers become the same.
We therefore continue with the solution that has the plus sign.

The general solution is $u^n = CA^n$, where $C$ is a constant determined from the
initial condition: $u^0 = C = I$. We have $u^n = IA^n$ and $v^n = qIA^n$. The final
solutions are just the real part of the expressions in polar form:

$$u^n = I(1 + \omega^2\Delta t^2)^{n/2}\cos(n\tan^{-1}(\omega\Delta t)), \tag{1.44}$$
$$v^n = -\omega I(1 + \omega^2\Delta t^2)^{n/2}\sin(n\tan^{-1}(\omega\Delta t)). \tag{1.45}$$


The expression $(1 + \omega^2\Delta t^2)^{n/2}$ causes growth of the amplitude, since a number
greater than one is raised to a positive exponent $n/2$. We can develop a series
expression to better understand the formula for the amplitude. Introducing $p = \omega\Delta t$
as the key variable and using sympy gives

>>> from sympy import *
>>> p = symbols('p', real=True)
>>> n = symbols('n', integer=True, positive=True)
>>> amplitude = (1 + p**2)**(n/2)
>>> amplitude.series(p, 0, 4)
1 + n*p**2/2 + O(p**4)


The amplitude goes like $1 + \frac{1}{2}n\omega^2\Delta t^2$, clearly growing linearly in time (with $n$).
We can also investigate the error in the angular frequency by a series expansion:


>>> n*atan(p).series(p, 0, 4) 
n*(p - p**3/3 + O(p**4)) 


This means that, with $t = n\Delta t$, the solution for $u^n$ can be written as

$$u^n = I\left(1 + \frac{1}{2}n\omega^2\Delta t^2 + O(\Delta t^4)\right)\cos\left(\omega t - \frac{1}{3}\omega^3 t\,\Delta t^2 + O(\Delta t^4)\right).$$

The error in the angular frequency is of the same order as in the scheme (1.7) for
the second-order ODE, but the error in the amplitude is severe.
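The closed-form discrete solution (1.44)-(1.45) can be compared directly with values produced by the Forward Euler updates (1.28)-(1.29). The sketch below does this for one arbitrary choice of parameters; the variable names are chosen for this example.

```python
import numpy as np

I, w, dt, Nt = 1.0, 2 * np.pi, 0.01, 100

# Forward Euler iteration (1.28)-(1.29)
u = np.zeros(Nt + 1)
v = np.zeros(Nt + 1)
u[0], v[0] = I, 0.0
for n in range(Nt):
    u[n + 1] = u[n] + dt * v[n]
    v[n + 1] = v[n] - dt * w**2 * u[n]

# Closed-form discrete solution (1.44)-(1.45)
n = np.arange(Nt + 1)
p = w * dt
growth = (1 + p**2) ** (n / 2)       # amplitude factor
theta = n * np.arctan(p)             # phase angle
u_formula = I * growth * np.cos(theta)
v_formula = -w * I * growth * np.sin(theta)

max_diff_u = np.abs(u - u_formula).max()
max_diff_v = np.abs(v - v_formula).max()
print(max_diff_u, max_diff_v)
```

Both differences should be at round-off level, while `growth[-1]` shows how much the amplitude has been amplified after `Nt` steps.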


1.6 Energy Considerations 


The observations of various methods in the previous section can be better interpreted
if we compute a quantity reflecting the total energy of the system. It turns out
that this quantity,

$$E(t) = \frac{1}{2}(u')^2 + \frac{1}{2}\omega^2 u^2,$$

is constant for all $t$. Checking that $E(t)$ really remains constant brings evidence
that the numerical computations are sound. It turns out that $E$ is proportional to the
mechanical energy in the system. Conservation of energy is much used to check
numerical simulations, so it is well invested time to dive into this subject.


1.6.1 Derivation of the Energy Expression 


We start out with multiplying

$$u'' + \omega^2 u = 0,$$

by $u'$ and integrating from $0$ to $T$:

$$\int_0^T u''u'\,dt + \int_0^T \omega^2 uu'\,dt = 0.$$

Observing that

$$u''u' = \frac{d}{dt}\frac{1}{2}(u')^2,\quad uu' = \frac{d}{dt}\frac{1}{2}u^2,$$

we get

$$\int_0^T \left(\frac{d}{dt}\frac{1}{2}(u')^2 + \frac{d}{dt}\frac{1}{2}\omega^2 u^2\right)dt = E(T) - E(0) = 0,$$

where we have introduced

$$E(t) = \frac{1}{2}(u')^2 + \frac{1}{2}\omega^2 u^2. \tag{1.46}$$

The important result from this derivation is that the total energy is constant:

$$E(t) = E(0).$$


E(t) is closely related to the system's energy 

The quantity $E(t)$ derived above is physically not the mechanical energy of a
vibrating mechanical system, but the energy per unit mass. To see this, we start
with Newton's second law $F = ma$ ($F$ is the sum of forces, $m$ is the mass of the
system, and $a$ is the acceleration). The displacement $u$ is related to $a$ through
$a = u''$. With a spring force as the only force we have $F = -ku$, where $k$ is a
spring constant measuring the stiffness of the spring. Newton's second law then
implies the differential equation

$$-ku = mu'' \quad\Rightarrow\quad mu'' + ku = 0.$$

This equation of motion can be turned into an energy balance equation by finding
the work done by each term during a time interval $[0, T]$. To this end, we multiply
the equation by $du = u'dt$ and integrate:

$$\int_0^T mu''u'\,dt + \int_0^T kuu'\,dt = 0.$$

The result is $\tilde E(T) - \tilde E(0) = 0$ with

$$\tilde E(t) = E_k(t) + E_p(t),$$

where

$$E_k(t) = \frac{1}{2}mv^2,\quad v = u', \tag{1.47}$$

is the kinetic energy of the system, and

$$E_p(t) = \frac{1}{2}ku^2 \tag{1.48}$$

is the potential energy. The sum $\tilde E(t)$ is the total mechanical energy. The derivation
demonstrates the famous energy principle that, under the right physical
circumstances, any change in the kinetic energy is due to a change in potential
energy and vice versa. (This principle breaks down when we introduce damping
in the system, as we do in Sect. 1.10.)

The equation $mu'' + ku = 0$ can be divided by $m$ and written as $u'' + \omega^2 u = 0$
for $\omega = \sqrt{k/m}$. The energy expression $E(t) = \frac{1}{2}(u')^2 + \frac{1}{2}\omega^2 u^2$ derived earlier
is then $\tilde E(t)/m$, i.e., mechanical energy per unit mass.


Energy of the exact solution Analytically, we have $u(t) = I\cos\omega t$, if $u(0) = I$
and $u'(0) = 0$, so we can easily check the energy evolution and confirm that $E(t)$
is constant:

$$E(t) = \frac{1}{2}(-I\omega\sin\omega t)^2 + \frac{1}{2}\omega^2 I^2\cos^2\omega t = \frac{1}{2}\omega^2 I^2(\sin^2\omega t + \cos^2\omega t) = \frac{1}{2}\omega^2 I^2.$$


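Growth of energy in the Forward Euler scheme It is easy to show that the energy
in the Forward Euler scheme increases when stepping from time level $n$ to $n+1$:

$$E^{n+1} = \frac{1}{2}(v^{n+1})^2 + \frac{1}{2}\omega^2(u^{n+1})^2$$
$$= \frac{1}{2}(v^n - \omega^2\Delta t\, u^n)^2 + \frac{1}{2}\omega^2(u^n + \Delta t\, v^n)^2$$
$$= (1 + \Delta t^2\omega^2)E^n.$$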
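The growth factor $1 + \Delta t^2\omega^2$ is easy to observe in a small experiment. The sketch below evaluates $E^n = \frac{1}{2}(v^n)^2 + \frac{1}{2}\omega^2(u^n)^2$ along a Forward Euler run and compares the ratio $E^{n+1}/E^n$ with the predicted factor; the parameter values are arbitrary choices for illustration.

```python
import numpy as np

I, w, dt, Nt = 1.0, 2 * np.pi, 0.025, 40

u, v = I, 0.0
E = [0.5 * v**2 + 0.5 * w**2 * u**2]
for n in range(Nt):
    # Tuple assignment so that both updates use the old u and v
    u, v = u + dt * v, v - dt * w**2 * u
    E.append(0.5 * v**2 + 0.5 * w**2 * u**2)

E = np.array(E)
ratios = E[1:] / E[:-1]              # measured growth per step
expected = 1 + dt**2 * w**2          # predicted growth per step
print(ratios.min(), ratios.max(), expected)
```

After `Nt` steps the energy has grown by `expected**Nt`, so the growth is exponential in the number of steps for a fixed $\Delta t$.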


1.6.2 An Error Measure Based on Energy 


The constant energy is well expressed by its initial value $E(0)$, so that the error in
mechanical energy can be computed as a mesh function by

$$e_E^n = \frac{1}{2}\left(\frac{u^{n+1} - u^{n-1}}{2\Delta t}\right)^2 + \frac{1}{2}\omega^2(u^n)^2 - E(0),\quad n = 1,\ldots,N_t - 1, \tag{1.49}$$

where

$$E(0) = \frac{1}{2}V^2 + \frac{1}{2}\omega^2 I^2,$$

if $u(0) = I$ and $u'(0) = V$. Note that we have used a centered approximation to
$u'$: $u'(t_n) \approx [D_{2t}u]^n$.

A useful norm of the mesh function $e_E^n$ for the discrete mechanical energy can
be the maximum absolute value of $e_E^n$:

$$||e_E^n||_{\ell^\infty} = \max_{1\le n < N_t} |e_E^n|.$$
Alternatively, we can compute other norms involving integration over all mesh
points, but we are often interested in worst case deviation of the energy, and then
the maximum value is of particular relevance.

A vectorized Python implementation of $e_E^n$ takes the form

# import numpy as np and compute u, t
dt = t[1]-t[0]
E = 0.5*((u[2:] - u[:-2])/(2*dt))**2 + 0.5*w**2*u[1:-1]**2
E0 = 0.5*V**2 + 0.5*w**2*I**2
e_E = E - E0
e_E_norm = np.abs(e_E).max()


The convergence rates of the quantity e_E_norm can be used for verification.
The value of e_E_norm is also useful for comparing schemes through their ability
to preserve energy. Below is a table demonstrating the relative error in total energy
for various schemes (computed by the vib_undamped_odespy.py program). The
test problem is $u'' + 4\pi^2 u = 0$ with $u(0) = 1$ and $u'(0) = 0$, so the period is
1 and, from (1.46), $E(t) = \frac{1}{2}\omega^2 I^2 = 2\pi^2 \approx 19.74$. We clearly see that the
Crank-Nicolson and the Runge-Kutta schemes are superior to the Forward and
Backward Euler schemes already after one period.


Method                   $T$   $\Delta t$   $\max |e_E^n| / e_E^0$
Forward Euler            1     0.025    $1.678\cdot 10^{0}$
Backward Euler           1     0.025    $6.235\cdot 10^{-1}$
Crank-Nicolson           1     0.025    $1.221\cdot 10^{-2}$
Runge-Kutta 2nd-order    1     0.025    $6.076\cdot 10^{-3}$
Runge-Kutta 4th-order    1     0.025    $8.214\cdot 10^{-3}$


However, after 10 periods, the picture is much more dramatic: 


Method                   $T$   $\Delta t$   $\max |e_E^n| / e_E^0$
Forward Euler            10    0.025    $1.788\cdot 10^{4}$
Backward Euler           10    0.025    $1.000\cdot 10^{0}$
Crank-Nicolson           10    0.025    $1.221\cdot 10^{-2}$
Runge-Kutta 2nd-order    10    0.025    $6.250\cdot 10^{-2}$
Runge-Kutta 4th-order    10    0.025    $8.288\cdot 10^{-3}$


The Runge-Kutta and Crank-Nicolson methods hardly change their energy error
with $T$, while the error in the Forward Euler method grows to huge levels and a
relative error of 1 in the Backward Euler method points to $E(t) \rightarrow 0$ as $t$ grows
large.

Running multiple values of At, we can get some insight into the convergence of 
the energy error: 
Method                   $T$   $\Delta t$   $\max |e_E^n| / e_E^0$
Forward Euler            10    0.05     $1.120\cdot 10^{8}$
Forward Euler            10    0.025    $1.788\cdot 10^{4}$
Forward Euler            10    0.0125   $1.374\cdot 10^{2}$
Backward Euler           10    0.05     $1.000\cdot 10^{0}$
Backward Euler           10    0.025    $1.000\cdot 10^{0}$
Backward Euler           10    0.0125   $9.928\cdot 10^{-1}$
Crank-Nicolson           10    0.05     $4.756\cdot 10^{-2}$
Crank-Nicolson           10    0.025    $1.221\cdot 10^{-2}$
Crank-Nicolson           10    0.0125   $3.125\cdot 10^{-3}$
Runge-Kutta 2nd-order    10    0.05     $6.152\cdot 10^{-1}$
Runge-Kutta 2nd-order    10    0.025    $6.250\cdot 10^{-2}$
Runge-Kutta 2nd-order    10    0.0125   $7.631\cdot 10^{-3}$
Runge-Kutta 4th-order    10    0.05     $3.510\cdot 10^{-2}$
Runge-Kutta 4th-order    10    0.025    $8.288\cdot 10^{-3}$
Runge-Kutta 4th-order    10    0.0125   $2.058\cdot 10^{-3}$


A striking fact from this table is that the error of the Forward Euler method is
reduced by the same factor as $\Delta t$ is reduced by, while the error in the Crank-Nicolson
method has a reduction proportional to $\Delta t^2$ (we cannot say anything for the
Backward Euler method). However, for the RK2 method, halving $\Delta t$ reduces the error
by almost a factor of 10 (!), and for the RK4 method the reduction seems
proportional to $\Delta t^2$ only (and the trend is confirmed by running smaller time steps, so for
$\Delta t = 3.9\cdot 10^{-4}$ the relative error of RK2 is a factor 10 smaller than that of RK4!).
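The remark that convergence rates of e_E_norm can be used for verification may be illustrated as follows. The sketch below does not use Odespy. Instead it applies the centered scheme for the second-order ODE (with a centered start for $u'(0) = 0$), computes the energy error norm from (1.49) for a sequence of halved time steps, and estimates the rate $r$ from consecutive experiments. The function names `solver` and `energy_error_norm` are chosen for this example.

```python
import numpy as np

def solver(I, w, dt, T):
    """Centered scheme u^{n+1} = 2u^n - u^{n-1} - dt^2 w^2 u^n, u'(0)=0."""
    Nt = int(round(T / dt))
    u = np.zeros(Nt + 1)
    t = np.linspace(0, Nt * dt, Nt + 1)
    u[0] = I
    u[1] = u[0] - 0.5 * dt**2 * w**2 * u[0]
    for n in range(1, Nt):
        u[n + 1] = 2 * u[n] - u[n - 1] - dt**2 * w**2 * u[n]
    return u, t

def energy_error_norm(u, t, w, I, V=0.0):
    """Max norm of the energy error mesh function (1.49)."""
    dt = t[1] - t[0]
    E = 0.5 * ((u[2:] - u[:-2]) / (2 * dt))**2 + 0.5 * w**2 * u[1:-1]**2
    E0 = 0.5 * V**2 + 0.5 * w**2 * I**2
    return np.abs(E - E0).max()

I, w, T = 1.0, 2 * np.pi, 10.0
dt_values = [0.05 / 2**k for k in range(4)]
errors = [energy_error_norm(*solver(I, w, dt, T), w, I) for dt in dt_values]

# Rate between consecutive experiments: r = ln(e_{i-1}/e_i) / ln(dt_{i-1}/dt_i)
rates = [np.log(errors[i - 1] / errors[i]) /
         np.log(dt_values[i - 1] / dt_values[i])
         for i in range(1, len(errors))]
print(rates)
```

For this scheme the estimated rates should be close to 2, so a test can assert that the last rate deviates from 2 by less than some tolerance.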


1.7 The Euler-Cromer Method 


While the Runge-Kutta methods and the Crank-Nicolson scheme work well for
the vibration equation modeled as a first-order ODE system, both were inferior
to the straightforward centered difference scheme for the second-order equation
$u'' + \omega^2 u = 0$. However, there is a similarly successful scheme available for the
first-order system $u' = v$, $v' = -\omega^2 u$, to be presented below. The ideas of the
scheme and their further developments have become very popular in particle and
rigid body dynamics and hence are widely used by physicists.


1.7.1 Forward-Backward Discretization 


The idea is to apply a Forward Euler discretization to the first equation and a
Backward Euler discretization to the second. In operator notation this is stated as

$$[D_t^+ u = v]^n, \tag{1.50}$$
$$[D_t^- v = -\omega^2 u]^{n+1}. \tag{1.51}$$
We can write out the formulas and collect the unknowns on the left-hand side:

$$u^{n+1} = u^n + \Delta t\, v^n, \tag{1.52}$$
$$v^{n+1} = v^n - \Delta t\,\omega^2 u^{n+1}. \tag{1.53}$$

We realize that after $u^{n+1}$ has been computed from (1.52), it may be used directly
in (1.53) to compute $v^{n+1}$.

In physics, it is more common to update the $v$ equation first, with a forward
difference, and thereafter the $u$ equation, with a backward difference that applies
the most recently computed $v$ value:

$$v^{n+1} = v^n - \Delta t\,\omega^2 u^n, \tag{1.54}$$
$$u^{n+1} = u^n + \Delta t\, v^{n+1}. \tag{1.55}$$


The advantage of ordering the ODEs as in (1.54)-(1.55) becomes evident when
considering complicated models. Such models are included if we write our vibration
ODE more generally as

$$u'' + g(u, u', t) = 0.$$

We can rewrite this second-order ODE as two first-order ODEs,

$$v' = -g(u, v, t),$$
$$u' = v.$$

This rewrite allows the following scheme to be used:

$$v^{n+1} = v^n - \Delta t\, g(u^n, v^n, t_n),$$
$$u^{n+1} = u^n + \Delta t\, v^{n+1}.$$

We realize that the first update works well with any $g$ since old values $u^n$ and $v^n$ are
used. Switching the equations would demand $u^{n+1}$ and $v^{n+1}$ values in $g$ and result
in nonlinear algebraic equations to be solved at each time level.
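The scheme for a general $g$ can be sketched in a few lines. The function below is an illustration written for this text, not code from the book's source files: the name `euler_cromer`, its argument list, and the pendulum-like test model $g = \omega^2\sin u$ are all assumptions made here.

```python
import numpy as np

def euler_cromer(g, I, V, dt, T):
    """
    Solve u'' + g(u, u', t) = 0, u(0)=I, u'(0)=V, by
    v^{n+1} = v^n - dt*g(u^n, v^n, t_n), u^{n+1} = u^n + dt*v^{n+1}.
    """
    Nt = int(round(T / dt))
    t = np.linspace(0, Nt * dt, Nt + 1)
    u = np.zeros(Nt + 1)
    v = np.zeros(Nt + 1)
    u[0], v[0] = I, V
    for n in range(Nt):
        v[n + 1] = v[n] - dt * g(u[n], v[n], t[n])   # explicit: old values only
        u[n + 1] = u[n] + dt * v[n + 1]              # uses the new velocity
    return u, v, t

# Hypothetical nonlinear test model: pendulum-like restoring force
w = 2 * np.pi
u, v, t = euler_cromer(lambda u, v, t: w**2 * np.sin(u),
                       I=1.0, V=0.0, dt=0.01, T=5.0)
max_amplitude = np.abs(u).max()
print("max |u|:", max_amplitude)
```

No nonlinear algebraic equations arise even though $g$ is nonlinear, because $g$ is only evaluated at already computed values. With $g = \omega^2 u$ the function reduces to (1.54)-(1.55).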

The scheme (1.54)-(1.55) goes under several names: forward-backward scheme,
semi-implicit Euler method¹⁴, semi-explicit Euler, symplectic Euler,
Newton-Störmer-Verlet, and Euler-Cromer. We shall stick to the latter name.

How well does the Euler-Cromer method preserve the total energy? We may run the
example from Sect. 1.6.2:

Method          $T$   $\Delta t$   $\max |e_E^n| / e_E^0$
Euler-Cromer    10    0.05     $2.530\cdot 10^{-2}$
Euler-Cromer    10    0.025    $6.206\cdot 10^{-3}$
Euler-Cromer    10    0.0125   $1.544\cdot 10^{-3}$

The relative error in the total energy decreases as $\Delta t^2$, and the error level is slightly
lower than for the Crank-Nicolson and Runge-Kutta methods.


14 http://en.wikipedia.org/wiki/Semi-implicit_Euler_method 
1.7.2 Equivalence with the Scheme for the Second-Order ODE


We shall now show that the Euler-Cromer scheme for the system of first-order equa- 
tions is equivalent to the centered finite difference method for the second-order 
vibration ODE (!). 

We may eliminate the $v^n$ variable from (1.52)-(1.53) or (1.54)-(1.55). Inserting
(1.54) for $v^{n+1}$ in (1.55) gives

$$u^{n+1} = u^n + \Delta t\,(v^n - \omega^2\Delta t\, u^n). \tag{1.56}$$

The $v^n$ quantity can be expressed by $u^n$ and $u^{n-1}$ using (1.55):

$$v^n = \frac{u^n - u^{n-1}}{\Delta t},$$

and when this is inserted in (1.56) we get

$$u^{n+1} = 2u^n - u^{n-1} - \Delta t^2\omega^2 u^n, \tag{1.57}$$

which is nothing but the centered scheme (1.7)! The two seemingly different
numerical methods are mathematically equivalent. Consequently, the previous analysis of
(1.7) also applies to the Euler-Cromer method. In particular, the amplitude is
constant, given that the stability criterion is fulfilled, but there is always an angular
frequency error (1.19). Exercise 1.18 gives guidance on how to derive the exact
discrete solution of the two equations in the Euler-Cromer method.

Although the Euler-Cromer scheme and the method (1.7) are equivalent, there
could be differences in the way they handle the initial conditions. Let us look into
this topic. The initial condition $u' = 0$ means $u' = v = 0$. From (1.54) we get

$$v^1 = v^0 - \Delta t\,\omega^2 u^0 = -\Delta t\,\omega^2 u^0,$$

and from (1.55) it follows that

$$u^1 = u^0 + \Delta t\, v^1 = u^0 - \omega^2\Delta t^2 u^0.$$

When we previously used a centered approximation of $u'(0) = 0$ combined with
the discretization (1.7) of the second-order ODE, we got a slightly different result:
$u^1 = u^0 - \frac{1}{2}\omega^2\Delta t^2 u^0$. The difference is $\frac{1}{2}\omega^2\Delta t^2 u^0$, which is of second order in
$\Delta t$, seemingly consistent with the overall error in the scheme for the differential
equation model.

A different view can also be taken. If we approximate $u'(0) = 0$ by a backward
difference, $(u^0 - u^{-1})/\Delta t = 0$, we get $u^{-1} = u^0$, and when combined with (1.7),
it results in $u^1 = u^0 - \omega^2\Delta t^2 u^0$. This means that the Euler-Cromer method based
on (1.54)-(1.55) corresponds to using only a first-order approximation to the initial
condition in the method from Sect. 1.1.2.

Correspondingly, using the formulation (1.52)-(1.53) with $v^0 = 0$ leads to
$u^1 = u^0$, which can be interpreted as using a forward difference approximation for
the initial condition $u'(0) = 0$. Both Euler-Cromer formulations lead to slightly
different values for $u^1$ compared to the method in Sect. 1.1.2. The error is $\frac{1}{2}\omega^2\Delta t^2 u^0$.
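The three candidate values of $u^1$ discussed above can be computed side by side. In this small sketch the variable names and the parameter values are chosen for illustration only.

```python
import math

I, w, dt = 1.0, 2 * math.pi, 0.05

# Euler-Cromer ordering (1.54)-(1.55): v first, then u
v1 = 0.0 - dt * w**2 * I
u1_vu = I + dt * v1                 # equals I - w^2 dt^2 I

# Euler-Cromer ordering (1.52)-(1.53): u first, with v^0 = 0
u1_uv = I + dt * 0.0                # equals I

# Centered scheme with a centered difference for u'(0) = 0
u1_centered = I - 0.5 * dt**2 * w**2 * I

deviation = 0.5 * w**2 * dt**2 * I  # predicted size of the mismatch
print(u1_centered - u1_vu, u1_uv - u1_centered, deviation)
```

The two Euler-Cromer variants land on opposite sides of the centered value, each at the distance $\frac{1}{2}\omega^2\Delta t^2 u^0$.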
1.7.3 Implementation


Solver function The function below, found in vib_undamped_EulerCromer.py,
implements the Euler-Cromer scheme (1.54)-(1.55):

import numpy as np

def solver(I, w, dt, T):
    """
    Solve v' = - w**2*u, u'=v for t in (0,T], u(0)=I and v(0)=0,
    by an Euler-Cromer method.
    """
    dt = float(dt)
    Nt = int(round(T/dt))
    u = np.zeros(Nt+1)
    v = np.zeros(Nt+1)
    t = np.linspace(0, Nt*dt, Nt+1)

    v[0] = 0
    u[0] = I
    for n in range(0, Nt):
        v[n+1] = v[n] - dt*w**2*u[n]
        u[n+1] = u[n] + dt*v[n+1]
    return u, v, t


Verification Since the Euler-Cromer scheme is equivalent to the finite difference
method for the second-order ODE $u'' + \omega^2 u = 0$ (see Sect. 1.7.2), the performance
of the above solver function is the same as for the solver function in Sect. 1.2.
The only difference is the formula for the first time step, as discussed above. This
deviation in the Euler-Cromer scheme means that the discrete solution listed in
Sect. 1.4.4 is not a solution of the Euler-Cromer scheme!

To verify the implementation of the Euler-Cromer method we can adjust v[1]
so that the computer-generated values can be compared with the formula (1.20)
from Sect. 1.4.4. This adjustment is done in an alternative solver function,
solver_ic_fix in vib_EulerCromer.py. Since we now have an exact solution
of the discrete equations available, we can write a test function test_solver for
checking the equality of computed values with the formula (1.20):


def test_solver():
    """
    Test solver with fixed initial condition against
    equivalent scheme for the 2nd-order ODE u'' + u = 0.
    """
    I = 1.2; w = 2.0; T = 5
    dt = 2/w  # longest possible time step
    u, v, t = solver_ic_fix(I, w, dt, T)
    from vib_undamped import solver as solver2  # 2nd-order ODE
    u2, t2 = solver2(I, w, dt, T)
    error = np.abs(u - u2).max()
    tol = 1E-14
    assert error < tol
Another function, demo, visualizes the difference between the Euler-Cromer
scheme and the scheme (1.7) for the second-order ODE, arising from the mismatch
in the first time level.


Using Odespy The Euler-Cromer method is also available in the Odespy package. 
The important thing to remember, when using this implementation, is that we must 
order the unknowns as v and u, so the u vector at each time level consists of the 
velocity v as first component and the displacement u as second component: 


import numpy as np
import odespy

# Define ODE
def f(u, t, w=1):
    v, u = u
    return [-w**2*u, v]

# Initialize solver
I = 1
w = 2*np.pi
solver = odespy.EulerCromer(f, f_kwargs={'w': w})
solver.set_initial_condition([0, I])

# Compute time mesh
# (timesteps_per_period and num_periods must be set beforehand)
P = 2*np.pi/w  # duration of one period
dt = P/timesteps_per_period
Nt = num_periods*timesteps_per_period
T = Nt*dt
t_mesh = np.linspace(0, T, Nt+1)

# Solve ODE
u, t = solver.solve(t_mesh)
u = u[:,1]  # Extract displacement


Convergence rates We may use the convergence_rates function in the file
vib_undamped.py to investigate the convergence rate of the Euler-Cromer method,
see the convergence_rate function in the file vib_undamped_EulerCromer.py.
Since we could eliminate $v$ to get a scheme for $u$ that is equivalent to the finite
difference method for the second-order equation in $u$, we would expect the
convergence rates to be the same, i.e., $r = 2$. However, measuring the convergence
rate of $u$ in the Euler-Cromer scheme shows that $r = 1$ only! Adjusting the initial
condition does not change the rate. Adjusting $\omega$, as outlined in Sect. 1.4.2, gives
a 4th-order method there, while there is no increase in the measured rate in the
Euler-Cromer scheme. It is obvious that the Euler-Cromer scheme is dramatically
better than the two other first-order methods, Forward Euler and Backward
Euler, but this is not reflected in the convergence rate of $u$.
1.7.4 The Störmer-Verlet Algorithm


Another very popular algorithm for vibration problems, especially for long time
simulations, is the Störmer-Verlet algorithm. It has become the method among
physicists for molecular simulations as well as particle and rigid body dynamics.

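The method can be derived by applying the Euler-Cromer idea twice, in a symmetric
fashion, during the interval $[t_n, t_{n+1}]$:

1. solve $v' = -\omega^2 u$ by a Forward Euler step in $[t_n, t_{n+\frac{1}{2}}]$
2. solve $u' = v$ by a Backward Euler step in $[t_n, t_{n+\frac{1}{2}}]$
3. solve $u' = v$ by a Forward Euler step in $[t_{n+\frac{1}{2}}, t_{n+1}]$
4. solve $v' = -\omega^2 u$ by a Backward Euler step in $[t_{n+\frac{1}{2}}, t_{n+1}]$

With mathematics,

$$\frac{v^{n+\frac{1}{2}} - v^n}{\frac{1}{2}\Delta t} = -\omega^2 u^n,$$
$$\frac{u^{n+\frac{1}{2}} - u^n}{\frac{1}{2}\Delta t} = v^{n+\frac{1}{2}},$$
$$\frac{u^{n+1} - u^{n+\frac{1}{2}}}{\frac{1}{2}\Delta t} = v^{n+\frac{1}{2}},$$
$$\frac{v^{n+1} - v^{n+\frac{1}{2}}}{\frac{1}{2}\Delta t} = -\omega^2 u^{n+1}.$$

The two steps in the middle can be combined to

$$\frac{u^{n+1} - u^n}{\Delta t} = v^{n+\frac{1}{2}},$$

and consequently

$$v^{n+\frac{1}{2}} = v^n - \frac{1}{2}\Delta t\,\omega^2 u^n, \tag{1.58}$$
$$u^{n+1} = u^n + \Delta t\, v^{n+\frac{1}{2}}, \tag{1.59}$$
$$v^{n+1} = v^{n+\frac{1}{2}} - \frac{1}{2}\Delta t\,\omega^2 u^{n+1}. \tag{1.60}$$

Writing the last equation as $v^n = v^{n-\frac{1}{2}} - \frac{1}{2}\Delta t\,\omega^2 u^n$ and using this $v^n$ in the first
equation gives $v^{n+\frac{1}{2}} = v^{n-\frac{1}{2}} - \Delta t\,\omega^2 u^n$, and the scheme can be written as two steps:

$$v^{n+\frac{1}{2}} = v^{n-\frac{1}{2}} - \Delta t\,\omega^2 u^n, \tag{1.61}$$
$$u^{n+1} = u^n + \Delta t\, v^{n+\frac{1}{2}}, \tag{1.62}$$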


which is nothing but straightforward centered differences for the $2\times 2$ ODE system
on a staggered mesh, see Sect. 1.8.1. We have thus seen that four different reasonings
(discretizing $u'' + \omega^2 u = 0$ directly, using Euler-Cromer, using Störmer-Verlet, and
using centered differences for the $2\times 2$ system on a staggered mesh) all end up
with the same equations! The main difference is that the traditional Euler-Cromer
displays first-order convergence in $\Delta t$ (due to less symmetry in the way $u$ and $v$ are
treated) while the others are $O(\Delta t^2)$ schemes.

The most numerically stable scheme, with respect to accumulation of rounding
errors, is (1.61)-(1.62). It has, according to [6], better properties in this regard than
the direct scheme for the second-order ODE.
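A compact way to see the claimed equivalence is to implement the three-step form (1.58)-(1.60) and compare its $u$ values with the centered scheme for the second-order ODE. The sketch below does this for $u(0) = I$, $v(0) = 0$; the function names `stormer_verlet` and `centered` are chosen here and are not taken from the book's source files.

```python
import numpy as np

def stormer_verlet(I, w, dt, T):
    """Advance u and v with the three steps (1.58)-(1.60); v(0) = 0."""
    Nt = int(round(T / dt))
    t = np.linspace(0, Nt * dt, Nt + 1)
    u = np.zeros(Nt + 1)
    v = np.zeros(Nt + 1)
    u[0], v[0] = I, 0.0
    for n in range(Nt):
        v_half = v[n] - 0.5 * dt * w**2 * u[n]          # (1.58)
        u[n + 1] = u[n] + dt * v_half                   # (1.59)
        v[n + 1] = v_half - 0.5 * dt * w**2 * u[n + 1]  # (1.60)
    return u, v, t

def centered(I, w, dt, T):
    """Centered scheme for u'' + w^2 u = 0 with a centered start."""
    Nt = int(round(T / dt))
    u = np.zeros(Nt + 1)
    u[0] = I
    u[1] = u[0] - 0.5 * dt**2 * w**2 * u[0]
    for n in range(1, Nt):
        u[n + 1] = 2 * u[n] - u[n - 1] - dt**2 * w**2 * u[n]
    return u

I, w, dt, T = 1.0, 2 * np.pi, 0.05, 5.0
u_sv, v_sv, t = stormer_verlet(I, w, dt, T)
u_c = centered(I, w, dt, T)
max_diff = np.abs(u_sv - u_c).max()
print("max difference:", max_diff)
```

Unlike the Euler-Cromer variants, this formulation also reproduces the first step $u^1 = u^0 - \frac{1}{2}\Delta t^2\omega^2 u^0$ of the centered scheme, so the two $u$ arrays should agree to round-off.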


1.8 Staggered Mesh 


A more intuitive discretization than the Euler-Cromer method, yet equivalent,
employs solely centered differences in a natural way for the $2\times 2$ first-order ODE
system. The scheme is in fact fully equivalent to the second-order scheme for
$u'' + \omega^2 u = 0$, also for the first time step. Such a scheme needs to operate on
a staggered mesh in time. Staggered meshes are very popular in many physical
applications, maybe foremost fluid dynamics and electromagnetics, so the topic is
important to learn.


1.8.1 The Euler-Cromer Scheme on a Staggered Mesh 


In a staggered mesh, the unknowns are sought at different points in the mesh.
Specifically, $u$ is sought at integer time points $t_n$ and $v$ is sought at $t_{n+1/2}$ between
two $u$ points. The unknowns are then $u^1, v^{3/2}, u^2, v^{5/2}$, and so on. We typically use
the notation $u^n$ and $v^{n+\frac{1}{2}}$ for the two unknown mesh functions. Figure 1.12 presents
a graphical sketch of two mesh functions $u$ and $v$ on a staggered mesh.

On a staggered mesh it is natural to use centered difference approximations,
expressed in operator notation as

$$[D_t u = v]^{n+\frac{1}{2}}, \tag{1.63}$$
$$[D_t v = -\omega^2 u]^{n+1}, \tag{1.64}$$

or if we switch the sequence of the equations:

$$[D_t v = -\omega^2 u]^n, \tag{1.65}$$
$$[D_t u = v]^{n+\frac{1}{2}}. \tag{1.66}$$

Writing out the formulas gives

$$v^{n+\frac{1}{2}} = v^{n-\frac{1}{2}} - \Delta t\,\omega^2 u^n, \tag{1.67}$$
$$u^{n+1} = u^n + \Delta t\, v^{n+\frac{1}{2}}. \tag{1.68}$$


We can eliminate the $v$ values and get back the centered scheme based on the
second-order differential equation $u'' + \omega^2 u = 0$, so all these three schemes are
equivalent. However, they differ somewhat in the treatment of the initial conditions.
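
The elimination of $v$ can be spelled out in a few lines. The following is a short derivation sketch that uses only (1.67) and (1.68): write (1.68) at two consecutive time levels, subtract, and insert (1.67) for the difference of the $v$ values.

```latex
\begin{align*}
u^{n+1} - u^{n} &= \Delta t\, v^{n+\frac{1}{2}}, \\
u^{n} - u^{n-1} &= \Delta t\, v^{n-\frac{1}{2}}, \\
u^{n+1} - 2u^{n} + u^{n-1}
  &= \Delta t\left(v^{n+\frac{1}{2}} - v^{n-\frac{1}{2}}\right)
   = -\Delta t^2\omega^2 u^{n} .
\end{align*}
```

Dividing the last line by $\Delta t^2$ gives $[D_tD_t u + \omega^2 u = 0]^n$, the centered scheme for the second-order ODE.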
Fig. 1.12 Examples of mesh functions on a staggered mesh in time

Suppose we have $u(0) = I$ and $u'(0) = v(0) = 0$ as mathematical initial
conditions. This means $u^0 = I$ and

$$v(0) \approx \frac{1}{2}\left(v^{-\frac{1}{2}} + v^{\frac{1}{2}}\right) = 0 \quad\Rightarrow\quad v^{-\frac{1}{2}} = -v^{\frac{1}{2}}.$$

Using the discretized equation (1.67) for $n = 0$ yields

$$v^{\frac{1}{2}} = v^{-\frac{1}{2}} - \Delta t\,\omega^2 I,$$

and eliminating $v^{-\frac{1}{2}} = -v^{\frac{1}{2}}$ results in

$$v^{\frac{1}{2}} = -\frac{1}{2}\Delta t\,\omega^2 I,$$

and

$$u^1 = u^0 - \frac{1}{2}\Delta t^2\omega^2 I,$$

which is exactly the same equation for $u^1$ as we had in the centered scheme based
on the second-order differential equation (and hence corresponds to a centered
difference approximation of the initial condition for $u'(0)$). The conclusion is that a
staggered mesh is fully equivalent with that scheme, while the forward-backward
version gives a slight deviation in the computation of $u^1$.

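We can redo the derivation of the initial conditions when $u'(0) = V$:

$$v(0) \approx \frac{1}{2}\left(v^{-\frac{1}{2}} + v^{\frac{1}{2}}\right) = V \quad\Rightarrow\quad v^{-\frac{1}{2}} = 2V - v^{\frac{1}{2}}.$$

Using this $v^{-\frac{1}{2}}$ in (1.67) for $n = 0$ and solving for $v^{\frac{1}{2}}$ gives the initial values

$$u^0 = I, \tag{1.69}$$
$$v^{\frac{1}{2}} = V - \frac{1}{2}\Delta t\,\omega^2 I. \tag{1.70}$$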


1.8.2 Implementation of the Scheme on a Staggered Mesh 
The algorithm goes like this:

1. Set the initial values (1.69) and (1.70).
2. For $n = 1, 2, \ldots$:
   (a) Compute $u^n$ from (1.68).
   (b) Compute $v^{n+\frac{1}{2}}$ from (1.67).


Implementation with integer indices Translating the schemes (1.68) and (1.67)
to computer code faces the problem of how to store and access $v^{n+\frac{1}{2}}$, since arrays
only allow integer indices with base 0. We must then introduce a convention: $v^{n+\frac{1}{2}}$
is stored in v[n] while $v^{n-\frac{1}{2}}$ is stored in v[n-1]. We can then write the algorithm
in Python as


from numpy import zeros, linspace

def solver(I, w, dt, T):
    dt = float(dt)
    Nt = int(round(T/dt))
    u = zeros(Nt+1)
    v = zeros(Nt+1)
    t = linspace(0, Nt*dt, Nt+1)  # mesh for u
    t_v = t + dt/2                # mesh for v

    u[0] = I
    v[0] = 0 - 0.5*dt*w**2*u[0]   # v^{1/2} from (1.70) with V=0
    for n in range(1, Nt+1):
        u[n] = u[n-1] + dt*v[n-1]
        v[n] = v[n-1] - dt*w**2*u[n]
    return u, t, v, t_v


Note that u and v are returned together with the mesh points such that the complete 
mesh function for u is described by u and t, while v and t_v represent the mesh 
function for v. 
Implementation with half-integer indices Some prefer to see a closer relationship
between the code and the mathematics for the quantities with half-integer
indices. For example, we would like to replace the updating equation for v[n]
by

    v[n+half] = v[n-half] - dt*w**2*u[n]

This is easy to do if we could be sure that n+half means n and n-half means n-1.
A possible solution is to define half as a special object such that an integer plus
half results in the integer, while an integer minus half equals the integer minus 1.
A simple Python class may realize the half object:


class HalfInt:
    def __radd__(self, other):
        return other

    def __rsub__(self, other):
        return other - 1

half = HalfInt()


The __radd__ function is invoked for all expressions n+half ("right add" with
self as half and other as n). Similarly, the __rsub__ function is invoked for
n-half and results in n-1.
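
A quick way to convince oneself of this index mapping is to evaluate a few expressions. The snippet below is a minimal sketch that repeats the class from the text; the value n = 3 is an arbitrary choice for illustration.

```python
class HalfInt:
    """Index helper: n + half evaluates to n and n - half to n - 1."""
    def __radd__(self, other):
        return other

    def __rsub__(self, other):
        return other - 1

half = HalfInt()

n = 3
print(n + half)   # array index used for v^{n+1/2}
print(n - half)   # array index used for v^{n-1/2}
```

Note that only expressions with the integer on the left are supported by this class; half + n is not defined and raises a TypeError.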

Using the half object, we can implement the algorithms in an even more readable
way:

from numpy import zeros, linspace

def solver(I, w, dt, T):
    """
    Solve u'=v, v' = - w**2*u for t in (0,T], u(0)=I and v(0)=0,
    by a central finite difference method with time step dt on
    a staggered mesh with v as unknown at (i+1/2)*dt time points.
    """
    dt = float(dt)
    Nt = int(round(T/dt))
    u = zeros(Nt+1)
    v = zeros(Nt+1)
    t = linspace(0, Nt*dt, Nt+1)  # mesh for u
    t_v = t + dt/2                # mesh for v

    u[0] = I
    v[0+half] = 0 - 0.5*dt*w**2*u[0]
    for n in range(1, Nt+1):
        u[n] = u[n-1] + dt*v[n-half]
        v[n+half] = v[n-half] - dt*w**2*u[n]
    return u, t, v[:-1], t_v[:-1]


Verification of this code is easy as we can just compare the computed u with
the u produced by the solver function in vib_undamped.py (which solves
$u'' + \omega^2 u = 0$ directly). The values should coincide to machine precision since
the two numerical methods are mathematically equivalent. We refer to the file
vib_undamped_staggered.py for the details of a unit test (test_staggered)
that checks this property.
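
The idea of such a test can be sketched as follows. This is not the test from vib_undamped_staggered.py, but a self-contained illustration under stated assumptions: the names solver_direct and solver_staggered, the parameter values, and the tolerance are choices made here, and both solvers are re-implemented from the formulas in the text with $u'(0) = 0$.

```python
import numpy as np

def solver_direct(I, w, dt, T):
    """Centered scheme for u'' + w**2*u = 0, u(0)=I, u'(0)=0."""
    dt = float(dt)
    Nt = int(round(T/dt))
    u = np.zeros(Nt+1)
    t = np.linspace(0, Nt*dt, Nt+1)
    u[0] = I
    u[1] = u[0] - 0.5*dt**2*w**2*u[0]
    for n in range(1, Nt):
        u[n+1] = 2*u[n] - u[n-1] - dt**2*w**2*u[n]
    return u, t

def solver_staggered(I, w, dt, T):
    """Staggered scheme (1.67)-(1.68); v[n] holds v^{n+1/2}."""
    dt = float(dt)
    Nt = int(round(T/dt))
    u = np.zeros(Nt+1)
    v = np.zeros(Nt+1)
    t = np.linspace(0, Nt*dt, Nt+1)
    u[0] = I
    v[0] = 0 - 0.5*dt*w**2*u[0]
    for n in range(1, Nt+1):
        u[n] = u[n-1] + dt*v[n-1]
        v[n] = v[n-1] - dt*w**2*u[n]
    return u, t, v, t + dt/2

def test_staggered():
    I, w, dt, T = 1.2, 2.0, 0.05, 5.0   # arbitrary test parameters
    u1, t1 = solver_direct(I, w, dt, T)
    u2, t2, v, t_v = solver_staggered(I, w, dt, T)
    diff = np.abs(u1 - u2).max()
    tol = 1e-12   # allows for different round-off in the two formulas
    assert diff < tol, diff

test_staggered()
```

The tolerance is deliberately somewhat larger than machine epsilon, since the two formulas perform the floating-point operations in a different order and the round-off errors accumulate over the time steps.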


1.9 Exercises and Problems 


Problem 1.1: Use linear/quadratic functions for verification 
Consider the ODE problem 


$$u'' + \omega^2 u = f(t), \quad u(0) = I,\ u'(0) = V,\ t \in (0,T].$$

a) Discretize this equation according to $[D_tD_t u + \omega^2 u = f]^n$ and derive the
equation for the first time step ($u^1$).

b) For verification purposes, we use the method of manufactured solutions
(MMS) with the choice of $u_e(t) = ct + d$. Find restrictions on $c$ and
$d$ from the initial conditions. Compute the corresponding source term $f$.
Show that $[D_tD_t t]^n = 0$ and use the fact that the $D_tD_t$ operator is linear,
$[D_tD_t(ct + d)]^n = c[D_tD_t t]^n + [D_tD_t d]^n = 0$, to show that $u_e$ is also a
perfect solution of the discrete equations.

c) Use sympy to do the symbolic calculations above. Here is a sketch of the program
vib_undamped_verify_mms.py:

import sympy as sym
V, t, I, w, dt = sym.symbols('V t I w dt')  # global symbols
f = None  # global variable for the source term in the ODE

def ode_source_term(u):
    """Return the terms in the ODE that the source term
    must balance, here u'' + w**2*u.
    u is symbolic Python function of t."""
    return sym.diff(u(t), t, t) + w**2*u(t)

def residual_discrete_eq(u):
    """Return the residual of the discrete eq. with u inserted."""
    R = ...
    return sym.simplify(R)

def residual_discrete_eq_step1(u):
    """Return the residual of the discrete eq. at the first
    step with u inserted."""
    R = ...
    return sym.simplify(R)

def DtDt(u, dt):
    """Return 2nd-order finite difference for u_tt.
    u is a symbolic Python function of t.
    """
    return ...

def main(u):
    """
    Given some chosen solution u (as a function of t, implemented
    as a Python function), use the method of manufactured solutions
    to compute the source term f, and check if u also solves
    the discrete equations.
    """
    print('=== Testing exact solution: %s ===' % u)
    print("Initial conditions u(0)=%s, u'(0)=%s:" %
          (u(t).subs(t, 0), sym.diff(u(t), t).subs(t, 0)))

    # Method of manufactured solution requires fitting f
    global f  # source term in the ODE
    f = sym.simplify(ode_source_term(u))

    # Residual in discrete equations (should be 0)
    print('residual step1: %s' % residual_discrete_eq_step1(u))
    print('residual: %s' % residual_discrete_eq(u))

def linear():
    main(lambda t: V*t + I)

if __name__ == '__main__':
    linear()


Fill in the various functions such that the calls in the main function work.

d) The purpose now is to choose a quadratic function $u_e = bt^2 + ct + d$ as exact
solution. Extend the sympy code above with a function quadratic for fitting f
and checking if the discrete equations are fulfilled. (The function is very similar
to linear.)

e) Will a polynomial of degree three fulfill the discrete equations? 

f) Implement a solver function for computing the numerical solution of this prob- 
lem. 

g) Write a test function for checking that the quadratic solution is computed cor- 
rectly (to machine precision, but the round-off errors accumulate and increase 
with T) by the solver function. 


Filename: vib_undamped_verify_mms. 


Exercise 1.2: Show linear growth of the phase with time 

Consider an exact solution $I\cos(\omega t)$ and an approximation $I\cos(\tilde\omega t)$. Define the
phase error as the time lag between the peak $I$ in the exact solution and the
corresponding peak in the approximation after $m$ periods of oscillations. Show that this
phase error is linear in $m$.

Filename: vib_phase_error_growth. 


Exercise 1.3: Improve the accuracy by adjusting the frequency 

According to (1.19), the numerical frequency deviates from the exact frequency by
a (dominating) amount $\omega^3\Delta t^2/24 > 0$. Replace the w parameter in the algorithm
in the solver function in vib_undamped.py by w*(1 - (1./24)*w**2*dt**2)
and test how this adjustment in the numerical algorithm improves the accuracy (use
$\Delta t = 0.1$ and simulate for 80 periods, with and without adjustment of $\omega$).
Filename: vib_adjust_w.


Exercise 1.4: See if adaptive methods improve the phase error 

Adaptive methods for solving ODEs aim at adjusting $\Delta t$ such that the error is within
a user-prescribed tolerance. Implement the equation $u'' + \omega^2 u = 0$ in the Odespy$^{15}$
software. Use the example from Section 3.2.11 in [9]. Run the scheme with a very
low tolerance (say $10^{-14}$) and for a long time, check the number of time points in
the solver's mesh (len(solver.t_all)), and compare the phase error with that
produced by the simple finite difference method from Sect. 1.1.2 with the same
number of (equally spaced) mesh points. The question is whether it pays off to use
an adaptive solver or if equally many points with a simple method give about the
same accuracy.

Filename: vib_undamped_adaptive. 


Exercise 1.5: Use a Taylor polynomial to compute $u^1$
As an alternative to computing $u^1$ by (1.8), one can use a Taylor polynomial with
three terms:

$$u(t_1) \approx u(0) + u'(0)\Delta t + \frac{1}{2}u''(0)\Delta t^2.$$

With $u'' = -\omega^2 u$ and $u'(0) = 0$, show that this method also leads to (1.8).
Generalize the condition on $u'(0)$ to be $u'(0) = V$ and compute $u^1$ in this case with
both methods.

Filename: vib_first_step. 


Problem 1.6: Derive and investigate the velocity Verlet method 
The velocity Verlet method for $u'' + \omega^2 u = 0$ is based on the following ideas:

1. step $u$ forward from $t_n$ to $t_{n+1}$ using a three-term Taylor series,
2. replace $u''$ by $-\omega^2 u$,
3. discretize $v' = -\omega^2 u$ by a Crank-Nicolson method.


Derive the scheme, implement it, and determine empirically the convergence rate. 


Problem 1.7: Find the minimal resolution of an oscillatory function 

Sketch the function on a given mesh which has the highest possible frequency. That
is, this oscillatory "cos-like" function has its maxima and minima at every two grid
points. Find an expression for the frequency of this function, and use the result
to find the largest relevant value of $\omega\Delta t$ when $\omega$ is the frequency of an oscillating
function and $\Delta t$ is the mesh spacing.

Filename: vib_largest_wdt. 


15 https://github.com/hplgit/odespy 
Exercise 1.8: Visualize the accuracy of finite differences for a cosine function
We introduce the error fraction

$$E = \frac{[D_tD_t u]^n}{u''(t_n)}$$

to measure the error in the finite difference approximation $D_tD_t u$ to $u''$. Compute
$E$ for the specific choice of a cosine/sine function of the form $u = \exp(i\omega t)$ and
show that

$$E = \left(\frac{2}{\omega\Delta t}\right)^2 \sin^2\left(\frac{\omega\Delta t}{2}\right).$$

Plot $E$ as a function of $p = \omega\Delta t$. The relevant values of $p$ are $[0, \pi]$ (see
Problem 1.7 for why $p > \pi$ does not make sense). The deviation of the curve
from unity visualizes the error in the approximation. Also expand $E$ as a Taylor
polynomial in $p$ up to fourth degree (use, e.g., sympy).

Filename: vib_plot_fd_exp_error. 


Exercise 1.9: Verify convergence rates of the error in energy 

We consider the ODE problem $u'' + \omega^2 u = 0$, $u(0) = I$, $u'(0) = V$, for $t \in (0, T]$.
The total energy of the solution $E(t) = \frac{1}{2}(u')^2 + \frac{1}{2}\omega^2 u^2$ should stay constant. The
error in energy can be computed as explained in Sect. 1.6.

Make a test function in a separate file, where code from vib_undamped.py is
imported, but the convergence_rates and test_convergence_rates functions
are copied and modified to also incorporate computations of the error in energy and
the convergence rate of this error. The expected rate is 2, just as for the solution
itself.

Filename: test_error_conv. 


Exercise 1.10: Use linear/quadratic functions for verification 

This exercise is a generalization of Problem 1.1 to the extended model problem 
(1.71) where the damping term is either linear or quadratic. Solve the various 
subproblems and see how the results and problem settings change with the gen- 
eralized ODE in case of linear or quadratic damping. By modifying the code from 
Problem 1.1, sympy will do most of the work required to analyze the generalized 
problem. 

Filename: vib_verify_mms. 


Exercise 1.11: Use an exact discrete solution for verification 

Write a test function in a separate file that employs the exact discrete solution (1.20)
to verify the implementation of the solver function in the file vib_undamped.py.
Filename: test_vib_undamped_exact_discrete_sol.


Exercise 1.12: Use analytical solution for convergence rate tests 

The purpose of this exercise is to perform convergence tests of the problem (1.71)
when $s(u) = cu$, $F(t) = A\sin\phi t$ and there is no damping. Find the complete
analytical solution to the problem in this case (most textbooks on mechanics or
ordinary differential equations list the various elements you need to write down the
exact solution, or you can use symbolic tools like sympy or wolframalpha.com).
Modify the convergence_rate function from the vib_undamped.py program to
perform experiments with the extended model. Verify that the error is of order $\Delta t^2$.
Filename: vib_conv_rate.

Fig. 1.13 The amplitude as it changes over 100 periods for RK3 and RK4


Exercise 1.13: Investigate the amplitude errors of many solvers 

Use the program vib_undamped_odespy.py from Sect. 1.5.4 (utilize the function
amplitudes) to investigate how well famous methods for 1st-order ODEs
can preserve the amplitude of $u$ in undamped oscillations. Test, for example,
the 3rd- and 4th-order Runge-Kutta methods (RK3, RK4), the Crank-Nicolson
method (CrankNicolson), the 2nd- and 3rd-order Adams-Bashforth methods
(AdamsBashforth2, AdamsBashforth3), and a 2nd-order Backwards scheme
(Backward2Step). The relevant governing equations are listed in the beginning of
Sect. 1.5.

Running the code, we get the plots seen in Figs. 1.13, 1.14, and 1.15. They show
that RK4 is superior to the others, but that CrankNicolson also performs well. In
fact, with RK4 the amplitude changes by less than 0.1 per cent over the interval.
Filename: vib_amplitude_errors.


Problem 1.14: Minimize memory usage of a simple vibration solver 

We consider the model problem $u'' + \omega^2 u = 0$, $u(0) = I$, $u'(0) = V$, solved
by a second-order finite difference scheme. A standard implementation typically
employs an array u for storing all the $u^n$ values. However, at some time level n+1
where we want to compute u[n+1], all we need of previous u values are from level
n and n-1. We can therefore avoid storing the entire array u, and instead work
with u[n+1], u[n], and u[n-1], named as u, u_n, u_nm1, for instance. Another
possible naming convention is u, u_n[0], u_n[-1]. Store the solution in a file
for later visualization. Make a test function that verifies the implementation by
comparing with another code for the same problem.
Filename: vib_memsave0.

Fig. 1.14 The amplitude as it changes over 100 periods for Crank-Nicolson and Backward 2 step
Fig. 1.15 The amplitude as it changes over 100 periods for Adams-Bashforth 2 and 3

Problem 1.15: Minimize memory usage of a general vibration solver

The program vib.py stores the complete solution $u^0, u^1, \ldots, u^{N_t}$ in memory,
which is convenient for later plotting. Make a memory minimizing version of this
program where only the last three $u^{n+1}$, $u^n$, and $u^{n-1}$ values are stored in memory
under the names u, u_n, and u_nm1 (this is the naming convention used in this
book). Write each computed $(t_{n+1}, u^{n+1})$ pair to file. Visualize the data in the file
(a cool solution is to read one line at a time and plot the u value using the
line-by-line plotter in the visualize_front_ascii function - this technique makes it
trivial to visualize very long time simulations).

Filename: vib_memsave. 


Exercise 1.16: Implement the Euler-Cromer scheme for the generalized model 
We consider the generalized model problem

$$mu'' + f(u') + s(u) = F(t), \quad u(0) = I,\ u'(0) = V.$$

a) Implement the Euler-Cromer method from Sect. 1.10.8.

b) We expect the Euler-Cromer method to have first-order convergence rate. Make
a unit test based on this expectation.

c) Consider a system with $m = 4$, $f(v) = b|v|v$, $b = 0.2$, $s = 2u$, $F = 0$.
Compute the solution using the centered difference scheme from Sect. 1.10.1
and the Euler-Cromer scheme for the longest possible time step $\Delta t$. We can use
the result from the case without damping, i.e., the largest $\Delta t = 2/\omega$, $\omega \approx \sqrt{0.5}$
in this case, but since $b$ will modify the frequency, we take the longest possible
time step as a safety factor 0.9 times $2/\omega$. Refine $\Delta t$ three times by a factor of
two and compare the two curves.


Filename: vib_EulerCromer. 


Problem 1.17: Interpret $[D_tD_t u]^n$ as a forward-backward difference

Show that the difference $[D_tD_t u]^n$ is equal to $[D_t^+D_t^- u]^n$ and $[D_t^-D_t^+ u]^n$. That is,
instead of applying a centered difference twice one can alternatively apply a mixture
of forward and backward differences.

Filename: vib_DtDt_fw_bw. 


Exercise 1.18: Analysis of the Euler-Cromer scheme 

The Euler-Cromer scheme for the model problem $u'' + \omega^2 u = 0$, $u(0) = I$, $u'(0) = 0$,
is given in (1.55)-(1.54). Find the exact discrete solutions of this scheme and
show that the solution for $u^n$ coincides with that found in Sect. 1.4.

Hint Use an "ansatz" $u^n = I\exp(i\tilde\omega\Delta t\, n)$ and $v^n = qu^n$, where $\tilde\omega$ and $q$ are
unknown parameters. The following formula is handy:

$$e^{i\tilde\omega\Delta t} + e^{i\tilde\omega(-\Delta t)} - 2 = 2\left(\cosh(i\tilde\omega\Delta t) - 1\right) = -4\sin^2\left(\frac{\tilde\omega\Delta t}{2}\right).$$

1.10 Generalization: Damping, Nonlinearities, and Excitation


We shall now generalize the simple model problem from Sect. 1.1 to include a
possibly nonlinear damping term $f(u')$, a possibly nonlinear spring (or restoring)
force $s(u)$, and some external excitation $F(t)$:

$$mu'' + f(u') + s(u) = F(t), \quad u(0) = I,\ u'(0) = V,\ t \in (0,T]. \tag{1.71}$$

We have also included a possibly nonzero initial value for $u'(0)$. The parameters $m$,
$f(u')$, $s(u)$, $F(t)$, $I$, $V$, and $T$ are input data.

There are two main types of damping (friction) forces: linear $f(u') = bu'$, or
quadratic $f(u') = bu'|u'|$. Spring systems often feature linear damping, while air
resistance usually gives rise to quadratic damping. Spring forces are often linear:
$s(u) = cu$, but nonlinear versions are also common, the most famous is the gravity
force on a pendulum that acts as a spring with $s(u) \sim \sin(u)$.


1.10.1 A Centered Scheme for Linear Damping 


Sampling (1.71) at a mesh point $t_n$, replacing $u''(t_n)$ by $[D_tD_t u]^n$, and $u'(t_n)$ by
$[D_{2t}u]^n$ results in the discretization

$$[mD_tD_t u + f(D_{2t}u) + s(u) = F]^n, \tag{1.72}$$

which written out means

$$m\frac{u^{n+1} - 2u^n + u^{n-1}}{\Delta t^2} + f\left(\frac{u^{n+1} - u^{n-1}}{2\Delta t}\right) + s(u^n) = F^n, \tag{1.73}$$

where $F^n$ as usual means $F(t)$ evaluated at $t = t_n$. Solving (1.73) with respect
to the unknown $u^{n+1}$ gives a problem: the $u^{n+1}$ inside the $f$ function makes the
equation nonlinear unless $f(u')$ is a linear function, $f(u') = bu'$. For now we shall
assume that $f$ is linear in $u'$. Then

$$m\frac{u^{n+1} - 2u^n + u^{n-1}}{\Delta t^2} + b\frac{u^{n+1} - u^{n-1}}{2\Delta t} + s(u^n) = F^n, \tag{1.74}$$

which gives an explicit formula for $u$ at each new time level:

$$u^{n+1} = \left(2mu^n + \left(\frac{b}{2}\Delta t - m\right)u^{n-1} + \Delta t^2\left(F^n - s(u^n)\right)\right)\left(m + \frac{b}{2}\Delta t\right)^{-1}. \tag{1.75}$$

For the first time step we need to discretize $u'(0) = V$ as $[D_{2t}u = V]^0$ and
combine with (1.75) for $n = 0$. The discretized initial condition leads to

$$u^{-1} = u^1 - 2\Delta t V, \tag{1.76}$$

which inserted in (1.75) for $n = 0$ gives an equation that can be solved for $u^1$:

$$u^1 = u^0 + \Delta t V + \frac{\Delta t^2}{2m}\left(-bV - s(u^0) + F^0\right). \tag{1.77}$$
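
The algebra behind (1.77) is short. As a sketch, start from (1.74) at $n = 0$ (which is equivalent to (1.75)), insert the fictitious value from (1.76), and solve for $u^1$:

```latex
\begin{align*}
m\frac{u^{1} - 2u^{0} + u^{-1}}{\Delta t^2}
  + b\frac{u^{1} - u^{-1}}{2\Delta t} + s(u^{0}) &= F^{0}, \\
m\frac{2u^{1} - 2u^{0} - 2\Delta t V}{\Delta t^2} + bV + s(u^{0}) &= F^{0}, \\
u^{1} &= u^{0} + \Delta t V
  + \frac{\Delta t^2}{2m}\left(-bV - s(u^{0}) + F^{0}\right).
\end{align*}
```

The damping term reduces to $bV$ because the centered difference $(u^1 - u^{-1})/(2\Delta t)$ equals $V$ by construction.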
1.10.2 A Centered Scheme for Quadratic Damping


When $f(u') = bu'|u'|$, we get a quadratic equation for $u^{n+1}$ in (1.73). This equation
can be straightforwardly solved by the well-known formula for the roots of a
quadratic equation. However, we can also avoid the nonlinearity by introducing an
approximation with an error of order no higher than what we already have from
replacing derivatives with finite differences.

We start with (1.71) and only replace $u''$ by $D_tD_t u$, resulting in

$$[mD_tD_t u + bu'|u'| + s(u) = F]^n. \tag{1.78}$$


Here, $u'|u'|$ is to be computed at time $t_n$. The idea is now to introduce a geometric
mean, defined by

$$(w^2)^n \approx w^{n-\frac{1}{2}}w^{n+\frac{1}{2}},$$

for some quantity $w$ depending on time. The error in the geometric mean approximation
is $O(\Delta t^2)$, the same as in the approximation $u'' \approx D_tD_t u$. With $w = u'$ it
follows that

$$[u'|u'|]^n \approx u'(t_{n+\frac{1}{2}})\,|u'(t_{n-\frac{1}{2}})|.$$
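
The second-order accuracy of the geometric mean can be checked numerically. The snippet below is a small experiment, not part of vib.py: the test function $w(t) = \cos t$, the time point, and the helper name geometric_mean_error are arbitrary choices made for illustration.

```python
import numpy as np

def geometric_mean_error(w, t, dt):
    """Error in approximating w(t)**2 by w(t - dt/2)*w(t + dt/2)."""
    return abs(w(t)**2 - w(t - dt/2)*w(t + dt/2))

w = np.cos          # hypothetical smooth test function
t = 0.7             # arbitrary time point
dts = [0.1/2**k for k in range(5)]
errors = [geometric_mean_error(w, t, dt) for dt in dts]

# Estimated convergence rates from consecutive experiments
rates = [np.log(errors[i-1]/errors[i])/np.log(dts[i-1]/dts[i])
         for i in range(1, len(dts))]
print(rates)        # values close to 2 indicate an O(dt**2) error
```

For this particular $w$ the error equals $\sin^2(\Delta t/2)$ exactly, so the estimated rates approach 2 as $\Delta t$ is reduced.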


The next step is to approximate $u'$ at $t_{n\pm 1/2}$, and fortunately a centered difference
fits perfectly into the formulas since it involves $u$ values at the mesh points only.
With the approximations

$$u'(t_{n+1/2}) \approx [D_t u]^{n+\frac{1}{2}}, \quad u'(t_{n-1/2}) \approx [D_t u]^{n-\frac{1}{2}}, \tag{1.79}$$

we get

$$[u'|u'|]^n \approx [D_t u]^{n+\frac{1}{2}}\,\left|[D_t u]^{n-\frac{1}{2}}\right| = \frac{u^{n+1} - u^n}{\Delta t}\,\frac{|u^n - u^{n-1}|}{\Delta t}. \tag{1.80}$$

The counterpart to (1.73) is then

$$m\frac{u^{n+1} - 2u^n + u^{n-1}}{\Delta t^2} + b\frac{u^{n+1} - u^n}{\Delta t}\,\frac{|u^n - u^{n-1}|}{\Delta t} + s(u^n) = F^n, \tag{1.81}$$

which is linear in the unknown $u^{n+1}$. Therefore, we can easily solve (1.81) with
respect to $u^{n+1}$ and achieve the explicit updating formula

$$u^{n+1} = \left(m + b|u^n - u^{n-1}|\right)^{-1} \times \left(2mu^n - mu^{n-1} + bu^n|u^n - u^{n-1}| + \Delta t^2\left(F^n - s(u^n)\right)\right). \tag{1.82}$$


In the derivation of a special equation for the first time step we run into some
trouble: inserting (1.76) in (1.82) for $n = 0$ results in a complicated nonlinear
equation for $u^1$. By thinking differently about the problem we can easily get away
with the nonlinearity again. We have for $n = 0$ that $b[u'|u'|]^0 = bV|V|$. Using this
value in (1.78) gives

$$[mD_tD_t u + bV|V| + s(u) = F]^0. \tag{1.83}$$




Writing this equation out and using (1.76) results in the special equation for the first
time step:

$$u^1 = u^0 + \Delta t V + \frac{\Delta t^2}{2m}\left(-bV|V| - s(u^0) + F^0\right). \tag{1.84}$$


1.10.3 A Forward-Backward Discretization of the Quadratic 
Damping Term 


The previous section first proposed to discretize the quadratic damping term $|u'|u'$
using centered differences: $[|D_{2t}u|D_{2t}u]^n$. As this gives rise to a nonlinearity in
$u^{n+1}$, it was instead proposed to use a geometric mean combined with centered
differences. But there are other alternatives. To get rid of the nonlinearity in
$[|D_{2t}u|D_{2t}u]^n$, one can think differently: apply a backward difference to $|u'|$, such
that the term involves known values, and apply a forward difference to $u'$ to make
the term linear in the unknown $u^{n+1}$. With mathematics,

$$[b|u'|u']^n \approx b\,|[D_t^- u]^n|\,[D_t^+ u]^n = b\left|\frac{u^n - u^{n-1}}{\Delta t}\right|\frac{u^{n+1} - u^n}{\Delta t}. \tag{1.85}$$

The forward and backward differences both have an error proportional to $\Delta t$ so
one may think the discretization above leads to a first-order scheme. However,
by looking at the formulas, we realize that the forward-backward differences in
(1.85) result in exactly the same scheme as in (1.81) where we used a geometric
mean and centered differences and committed errors of size $O(\Delta t^2)$. Therefore,
the forward-backward differences in (1.85) act in a symmetric way and actually
produce a second-order accurate discretization of the quadratic damping term.


1.10.4 Implementation 


The algorithm arising from the methods in Sects. 1.10.1 and 1.10.2 is very similar to
the undamped case in Sect. 1.1.2. The difference is basically a question of different
formulas for $u^1$ and $u^{n+1}$. This is actually quite remarkable. The equation (1.71)
is normally impossible to solve by pen and paper, but possible for some special
choices of $F$, $s$, and $f$. In contrast, the complexity of the nonlinear generalized
model (1.71) versus the simple undamped model is not a big deal when we solve
the problem numerically!
The computational algorithm takes the form

1. $u^0 = I$
2. compute $u^1$ from (1.77) if linear damping or (1.84) if quadratic damping
3. for $n = 1, 2, \ldots, N_t - 1$:
   (a) compute $u^{n+1}$ from (1.75) if linear damping or (1.82) if quadratic damping

Modifying the solver function for the undamped case is fairly easy, the big difference
being many more terms and if tests on the type of damping:
import numpy as np

def solver(I, V, m, b, s, F, dt, T, damping='linear'):
    """
    Solve m*u'' + f(u') + s(u) = F(t) for t in (0,T],
    u(0)=I and u'(0)=V,
    by a central finite difference method with time step dt.
    If damping is 'linear', f(u')=b*u', while if damping is
    'quadratic', f(u')=b*u'*abs(u').
    F(t) and s(u) are Python functions.
    """
    dt = float(dt); b = float(b); m = float(m)  # avoid integer div.
    Nt = int(round(T/dt))
    u = np.zeros(Nt+1)
    t = np.linspace(0, Nt*dt, Nt+1)

    u[0] = I
    # Special formula for the first step: (1.77) or (1.84)
    if damping == 'linear':
        u[1] = u[0] + dt*V + dt**2/(2*m)*(-b*V - s(u[0]) + F(t[0]))
    elif damping == 'quadratic':
        u[1] = u[0] + dt*V + \
               dt**2/(2*m)*(-b*V*abs(V) - s(u[0]) + F(t[0]))

    # General updating formula: (1.75) or (1.82)
    for n in range(1, Nt):
        if damping == 'linear':
            u[n+1] = (2*m*u[n] + (b*dt/2 - m)*u[n-1] +
                      dt**2*(F(t[n]) - s(u[n])))/(m + b*dt/2)
        elif damping == 'quadratic':
            u[n+1] = (2*m*u[n] - m*u[n-1] + b*u[n]*abs(u[n] - u[n-1])
                      + dt**2*(F(t[n]) - s(u[n])))/\
                      (m + b*abs(u[n] - u[n-1]))
    return u, t


The complete code resides in the file vib.py.


1.10.5 Verification 


Constant solution For debugging and initial verification, a constant solution is
often very useful. We choose $u_e(t) = I$, which implies $V = 0$. Inserted in the
ODE, we get $F(t) = s(I)$ for any choice of $f$. Since the discrete derivative of
a constant vanishes (in particular, $[D_{2t}I]^n = 0$, $[D_tI]^n = 0$, and $[D_tD_tI]^n = 0$),
the constant solution also fulfills the discrete equations. The constant should
therefore be reproduced to machine precision. The function test_constant in
vib.py implements this test.


Linear solution Now we choose a linear solution: $u_e = ct + d$. The initial
condition $u(0) = I$ implies $d = I$, and $u'(0) = V$ forces $c$ to be $V$. Inserting
$u_e = Vt + I$ in the ODE with linear damping results in

$$0 + bV + s(Vt + I) = F(t),$$

while quadratic damping requires the source term

$$0 + b|V|V + s(Vt + I) = F(t).$$

Since the finite difference approximations used to compute $u'$ all are exact for a linear
function, it turns out that the linear $u_e$ is also a solution of the discrete equations.
Exercise 1.10 asks you to carry out all the details.


Quadratic solution Choosing $u_e = bt^2 + Vt + I$, with $b$ arbitrary, fulfills the
initial conditions and fits the ODE if $F$ is adjusted properly. The solution also solves
the discrete equations with linear damping. However, this quadratic polynomial in
$t$ does not fulfill the discrete equations in case of quadratic damping, because the
geometric mean used in the approximation of this term introduces an error. Doing
Exercise 1.10 will reveal the details. One can fit $F^n$ in the discrete equations such
that the quadratic polynomial is reproduced by the numerical method (to machine
precision).
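
A test along these lines can be sketched as follows for the linear damping case. This is an illustration and not the test code in vib.py: the solver below repeats only the linear damping branch of the scheme, the quadratic coefficient is called q here (to avoid a clash with the damping parameter b), and all parameter values and the tolerance are arbitrary assumptions.

```python
import numpy as np

def solver(I, V, m, b, s, F, dt, T):
    """Centered scheme (1.75), (1.77) for m*u'' + b*u' + s(u) = F(t)."""
    dt = float(dt)
    Nt = int(round(T/dt))
    u = np.zeros(Nt+1)
    t = np.linspace(0, Nt*dt, Nt+1)
    u[0] = I
    u[1] = u[0] + dt*V + dt**2/(2*m)*(-b*V - s(u[0]) + F(t[0]))
    for n in range(1, Nt):
        u[n+1] = (2*m*u[n] + (b*dt/2 - m)*u[n-1] +
                  dt**2*(F(t[n]) - s(u[n])))/(m + b*dt/2)
    return u, t

def test_constant():
    """u_e = I requires V = 0 and F(t) = s(I)."""
    I, m, b, c = 1.2, 2.0, 0.9, 1.5
    s = lambda u: c*u
    F = lambda t: s(I)
    u, t = solver(I, 0.0, m, b, s, F, dt=0.1, T=2)
    assert np.abs(u - I).max() < 1e-12

def test_quadratic():
    """u_e = q*t**2 + V*t + I with F fitted to the ODE."""
    I, V, m, b, c, q = 1.2, 0.5, 2.0, 0.9, 1.5, 0.3
    u_e = lambda t: q*t**2 + V*t + I
    s = lambda u: c*u
    # m*u_e'' + b*u_e' + s(u_e) = F(t)
    F = lambda t: m*2*q + b*(2*q*t + V) + s(u_e(t))
    u, t = solver(I, V, m, b, s, F, dt=0.1, T=2)
    assert np.abs(u_e(t) - u).max() < 1e-12

test_constant()
test_quadratic()
```

The tolerance is set somewhat above machine epsilon because round-off errors accumulate with the number of time steps.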


Catching bugs How good are the constant and quadratic solutions at catching bugs 
in the implementation? Let us check that by introducing some bugs. 


• Use m instead of 2*m in the denominator of u[1]: code works for constant solution,
  but fails (as it should) for a quadratic one.

• Use b*dt instead of b*dt/2 in the updating formula for u[n+1] in case of linear
  damping: constant and quadratic both fail.

• Use F[n+1] instead of F[n] in case of linear or quadratic damping: constant
  solution works, quadratic fails.


We realize that the constant solution is very useful for catching certain bugs because 
of its simplicity (easy to predict what the different terms in the formula should 
evaluate to), while the quadratic solution seems capable of detecting all (?) other 
kinds of typos in the scheme. These results demonstrate why we focus so much on 
exact, simple polynomial solutions of the numerical schemes in these writings. 


1.10.6 Visualization 


The functions for visualizations differ significantly from those in the undamped
case in the vib_undamped.py program because, in the present general case, we
do not have an exact solution to include in the plots. Moreover, we have no good
estimate of the periods of the oscillations as there will be one period determined by
the system parameters, essentially the approximate frequency $\sqrt{s'(0)/m}$ for linear
$s$ and small damping, and one period dictated by $F(t)$ in case the excitation is
periodic. This is, however, nothing that the program can depend on or make use of.
Therefore, the user has to specify $T$ and the window width to get a plot that moves
with the graph and shows the most recent parts of it in long time simulations.

The vib. py code contains several functions for analyzing the time series signal 
and for visualizing the solutions. 
1.10.7 User Interface


The main function is changed substantially from the vib_undamped.py code, since
we need to specify the new data $c$, $s(u)$, and $F(t)$. In addition, we must set $T$
and the plot window width (instead of the number of periods we want to simulate
as in vib_undamped.py). To figure out whether we can use one plot for the
whole time series or if we should follow the most recent part of $u$, we can use the
plot_empirical_freq_and_amplitude function's estimate of the number of local
maxima. This number is now returned from the function and used in main to
decide on the visualization technique.


def main():
    import argparse
    parser = argparse.ArgumentParser()
    parser.add_argument('--I', type=float, default=1.0)
    parser.add_argument('--V', type=float, default=0.0)
    parser.add_argument('--m', type=float, default=1.0)
    parser.add_argument('--c', type=float, default=0.0)
    parser.add_argument('--s', type=str, default='u')
    parser.add_argument('--F', type=str, default='0')
    parser.add_argument('--dt', type=float, default=0.05)
    parser.add_argument('--T', type=float, default=140)
    parser.add_argument('--damping', type=str, default='linear')
    parser.add_argument('--window_width', type=float, default=30)
    parser.add_argument('--savefig', action='store_true')
    a = parser.parse_args()
    # Turn the string expressions for s(u) and F(t) into callables
    from scitools.std import StringFunction
    s = StringFunction(a.s, independent_variable='u')
    F = StringFunction(a.F, independent_variable='t')
    I, V, m, c, dt, T, window_width, savefig, damping = \
       a.I, a.V, a.m, a.c, a.dt, a.T, a.window_width, a.savefig, \
       a.damping

    u, t = solver(I, V, m, c, s, F, dt, T, damping)
    num_periods = empirical_freq_and_amplitude(u, t)
    if num_periods <= 15:
        figure()
        visualize(u, t)
    else:
        visualize_front(u, t, window_width, savefig)
    show()


The program vib.py contains the above code snippets and can solve the model
problem (1.71). As a demo of vib.py, we consider the case $I = 1$, $V = 0$, $m = 1$,
$c = 0.03$, $s(u) = \sin(u)$, $F(t) = 3\cos(4t)$, $\Delta t = 0.05$, and $T = 140$. The
relevant command to run is

Terminal> python vib.py --s 'sin(u)' --F '3*cos(4*t)' --c 0.03

Fig. 1.16 Damped oscillator excited by a sinusoidal function

This results in a moving window following the function$^{16}$ on the screen. Figure 1.16
shows a part of the time series.


1.10.8 The Euler-Cromer Scheme for the Generalized Model 
The ideas of the Euler-Cromer method from Sect. 1.7 carry over to the generalized
model. We write (1.71) as two equations for $u$ and $v = u'$. The first equation is
taken as the one with $v'$ on the left-hand side:

$$v' = \frac{1}{m}\left(F(t) - s(u) - f(v)\right), \tag{1.86}$$
$$u' = v. \tag{1.87}$$

Again, the idea is to step (1.86) forward using a standard Forward Euler method,
while we update $u$ from (1.87) with a Backward Euler method, utilizing the recent,
computed $v^{n+1}$ value. In detail,

$$\frac{v^{n+1} - v^n}{\Delta t} = \frac{1}{m}\left(F(t_n) - s(u^n) - f(v^n)\right), \tag{1.88}$$
$$\frac{u^{n+1} - u^n}{\Delta t} = v^{n+1}, \tag{1.89}$$

16 http://tinyurl.com/hbcasmj/vib/html//mov-vib/vib_generalized_dt0.05/index.html 
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resulting in the explicit scheme 


1 
il aul + Ar (FQ) sta") ~ flO") (1.90) 
n+l yh p Ag yt! | (1.91) 


z 
| 


We immediately note one very favorable feature of this scheme: all the nonlin- 
earities in s(u) and f(v) are evaluated at a previous time level. This makes the 
Euler-Cromer method easier to apply and hence much more convenient than the 
centered scheme for the second-order ODE (1.71). 

The initial conditions are trivially set as

$$ v^0 = V, \tag{1.92} $$
$$ u^0 = I. \tag{1.93} $$
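
The explicit scheme (1.90)-(1.91) with the initial conditions (1.92)-(1.93) can be sketched in a few lines of Python. The function name euler_cromer and its argument list are assumptions made for this illustration and are not taken from vib.py; f, s, and F are assumed to be ordinary Python callables.

```python
import numpy as np

def euler_cromer(I, V, m, f, s, F, dt, T):
    """Solve m*u'' + f(u') + s(u) = F(t) by the scheme (1.90)-(1.91)."""
    Nt = int(round(T/dt))
    t = np.linspace(0, Nt*dt, Nt+1)
    u = np.zeros(Nt+1)
    v = np.zeros(Nt+1)
    u[0], v[0] = I, V   # initial conditions (1.92)-(1.93)
    for n in range(Nt):
        # Forward Euler for v: s and f are evaluated at known values only
        v[n+1] = v[n] + dt/m*(F(t[n]) - s(u[n]) - f(v[n]))
        # Backward Euler for u: uses the newly computed v[n+1]
        u[n+1] = u[n] + dt*v[n+1]
    return u, v, t

# Undamped, unforced linear oscillator u'' + u = 0 with exact solution cos(t)
u, v, t = euler_cromer(I=1.0, V=0.0, m=1.0, f=lambda v: 0.0,
                       s=lambda u: u, F=lambda t: 0.0,
                       dt=0.01, T=2*np.pi)
error = np.abs(u - np.cos(t)).max()
```

Since no equation has to be solved for the nonlinear terms, the loop body stays the same for any choice of s(u) and f(v).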


1.10.9 The Störmer-Verlet Algorithm for the Generalized Model 


We can easily apply the ideas from Sect. 1.7.4 to extend that method to the
generalized model

$$ v' = \frac{1}{m}\left(F(t) - s(u) - f(v)\right), $$
$$ u' = v. $$


However, since the scheme is essentially centered differences for the ODE system 
on a staggered mesh, we do not go into detail here, but refer to Sect. 1.10.10. 


1.10.10 A Staggered Euler-Cromer Scheme for a Generalized Model 
The more general model for vibration problems,

$$ mu'' + f(u') + s(u) = F(t),\quad u(0) = I,\ u'(0) = V,\ t \in (0,T], \tag{1.94} $$

can be rewritten as a first-order ODE system

$$ v' = m^{-1}\left(F(t) - f(v) - s(u)\right), \tag{1.95} $$
$$ u' = v. \tag{1.96} $$

It is natural to introduce a staggered mesh (see Sect. 1.8.1) and seek u at mesh
points $t_n$ (the numerical value is denoted by $u^n$) and v between mesh points at
$t_{n+1/2}$ (the numerical value is denoted by $v^{n+\frac{1}{2}}$). A centered
difference approximation to (1.95)-(1.96) can then be written in operator notation as

$$ [D_t v = m^{-1}\left(F(t) - f(v) - s(u)\right)]^n, \tag{1.97} $$
$$ [D_t u = v]^{n-\frac{1}{2}}. \tag{1.98} $$
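Written out,

$$ \frac{v^{n+\frac{1}{2}} - v^{n-\frac{1}{2}}}{\Delta t} = m^{-1}\left(F^n - f(v^n) - s(u^n)\right), \tag{1.99} $$
$$ \frac{u^n - u^{n-1}}{\Delta t} = v^{n-\frac{1}{2}}. \tag{1.100} $$

With linear damping, f(v) = bv, we can use an arithmetic mean for $f(v^n)$:
$f(v^n) \approx \frac{1}{2}(f(v^{n-\frac{1}{2}}) + f(v^{n+\frac{1}{2}}))$. The system
(1.99)-(1.100) can then be solved with respect to the unknowns $u^n$ and
$v^{n+\frac{1}{2}}$:

$$ v^{n+\frac{1}{2}} = \left(1 + \frac{b}{2m}\Delta t\right)^{-1}\left(v^{n-\frac{1}{2}} + \Delta t\, m^{-1}\left(F^n - \frac{1}{2}f(v^{n-\frac{1}{2}}) - s(u^n)\right)\right), \tag{1.101} $$
$$ u^n = u^{n-1} + \Delta t\, v^{n-\frac{1}{2}}. \tag{1.102} $$

In case of quadratic damping, f(v) = b|v|v, we can use a geometric mean:
$f(v^n) \approx b|v^{n-\frac{1}{2}}|v^{n+\frac{1}{2}}$. Inserting this approximation
in (1.99)-(1.100) and solving for the unknowns $u^n$ and $v^{n+\frac{1}{2}}$ results in

$$ v^{n+\frac{1}{2}} = \left(1 + \frac{b}{m}|v^{n-\frac{1}{2}}|\Delta t\right)^{-1}\left(v^{n-\frac{1}{2}} + \Delta t\, m^{-1}\left(F^n - s(u^n)\right)\right), \tag{1.103} $$
$$ u^n = u^{n-1} + \Delta t\, v^{n-\frac{1}{2}}. \tag{1.104} $$

The initial conditions are derived at the end of Sect. 1.8.1:

$$ u^0 = I, \tag{1.105} $$
$$ v^{\frac{1}{2}} = V - \frac{1}{2}\Delta t\,\omega^2 I. \tag{1.106} $$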
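
A possible Python sketch of the staggered scheme for both damping types follows. The function name solver_staggered is hypothetical, and the first half step is an assumption: it takes half a Forward Euler step for v from t = 0, which reduces to (1.106) when F = 0, f = 0, and s(u)/m = ω²u. The array element v[n] stores the value of v at t[n] + Δt/2.

```python
import numpy as np

def solver_staggered(I, V, m, b, s, F, dt, T, damping='linear'):
    """Staggered Euler-Cromer scheme; v[n] approximates v at t[n] + dt/2."""
    if damping not in ('linear', 'quadratic'):
        raise ValueError('damping must be "linear" or "quadratic"')
    Nt = int(round(T/dt))
    t = np.linspace(0, Nt*dt, Nt+1)
    u = np.zeros(Nt+1)
    v = np.zeros(Nt+1)
    if damping == 'linear':
        f = lambda v: b*v
    else:
        f = lambda v: b*abs(v)*v
    u[0] = I
    # Assumed start-up: half a Forward Euler step for v from t=0
    v[0] = V + 0.5*dt/m*(F(t[0]) - f(V) - s(I))
    for n in range(1, Nt+1):
        # Position update, (1.102) and (1.104); v[n-1] is v at t[n] - dt/2
        u[n] = u[n-1] + dt*v[n-1]
        if damping == 'linear':
            # Arithmetic mean of the damping term, (1.101)
            v[n] = (v[n-1] + dt/m*(F(t[n]) - 0.5*b*v[n-1] - s(u[n]))) / \
                   (1 + b*dt/(2*m))
        else:
            # Geometric mean of the damping term, (1.103)
            v[n] = (v[n-1] + dt/m*(F(t[n]) - s(u[n]))) / \
                   (1 + b/m*abs(v[n-1])*dt)
    return u, v, t
```

With b = 0 both branches coincide with the undamped staggered scheme, which gives a simple consistency check of an implementation.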


1.10.11 The PEFRL 4th-Order Accurate Algorithm 


A variant of the Euler-Cromer type of algorithm, which provides an error
$\mathcal{O}(\Delta t^4)$ if f(v) = 0, is called PEFRL [14]. This algorithm is very
well suited for integrating dynamic systems (especially those without damping) over
very long time periods. Define

$$ g(u, v) = \frac{1}{m}\left(F(t) - s(u) - f(v)\right). $$

The algorithm is explicit and features these steps:

$$ u^{n+1,1} = u^n + \xi\Delta t\, v^n, \tag{1.107} $$
$$ v^{n+1,1} = v^n + \frac{1}{2}(1 - 2\lambda)\Delta t\, g(u^{n+1,1}, v^n), \tag{1.108} $$
$$ u^{n+1,2} = u^{n+1,1} + \chi\Delta t\, v^{n+1,1}, \tag{1.109} $$
$$ v^{n+1,2} = v^{n+1,1} + \lambda\Delta t\, g(u^{n+1,2}, v^{n+1,1}), \tag{1.110} $$
$$ u^{n+1,3} = u^{n+1,2} + (1 - 2(\chi + \xi))\Delta t\, v^{n+1,2}, \tag{1.111} $$
$$ v^{n+1,3} = v^{n+1,2} + \lambda\Delta t\, g(u^{n+1,3}, v^{n+1,2}), \tag{1.112} $$
$$ u^{n+1,4} = u^{n+1,3} + \chi\Delta t\, v^{n+1,3}, \tag{1.113} $$
$$ v^{n+1} = v^{n+1,3} + \frac{1}{2}(1 - 2\lambda)\Delta t\, g(u^{n+1,4}, v^{n+1,3}), \tag{1.114} $$
$$ u^{n+1} = u^{n+1,4} + \xi\Delta t\, v^{n+1}. \tag{1.115} $$

The parameters ξ, λ, and χ have the values

$$ \xi = 0.1786178958448091, \tag{1.116} $$
$$ \lambda = -0.2123418310626054, \tag{1.117} $$
$$ \chi = -0.06626458266981849 . \tag{1.118} $$
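
The steps (1.107)-(1.115) translate almost line by line into code. The sketch below is an illustration under stated assumptions: the function name pefrl is hypothetical, and g is taken as a callable g(u, v) without explicit time dependence (i.e., F is constant or zero), since the text does not specify at which time F(t) should be sampled in the intermediate stages.

```python
import numpy as np

XI = 0.1786178958448091
LAMBDA = -0.2123418310626054
CHI = -0.06626458266981849

def pefrl(I, V, g, dt, T):
    """Advance u' = v, v' = g(u, v) by the PEFRL steps (1.107)-(1.115)."""
    Nt = int(round(T/dt))
    t = np.linspace(0, Nt*dt, Nt+1)
    u = np.zeros(Nt+1)
    v = np.zeros(Nt+1)
    u[0], v[0] = I, V
    for n in range(Nt):
        u_ = u[n] + XI*dt*v[n]                          # (1.107)
        v_ = v[n] + 0.5*(1 - 2*LAMBDA)*dt*g(u_, v[n])   # (1.108)
        u_ = u_ + CHI*dt*v_                             # (1.109)
        v_ = v_ + LAMBDA*dt*g(u_, v_)                   # (1.110)
        u_ = u_ + (1 - 2*(CHI + XI))*dt*v_              # (1.111)
        v_ = v_ + LAMBDA*dt*g(u_, v_)                   # (1.112)
        u_ = u_ + CHI*dt*v_                             # (1.113)
        v[n+1] = v_ + 0.5*(1 - 2*LAMBDA)*dt*g(u_, v_)   # (1.114)
        u[n+1] = u_ + XI*dt*v[n+1]                      # (1.115)
    return u, v, t

def max_error(dt, T=10.0):
    """Error for u'' + u = 0, u(0)=1, u'(0)=0, with exact solution cos(t)."""
    u, v, t = pefrl(1.0, 0.0, lambda u, v: -u, dt, T)
    return np.abs(u - np.cos(t)).max()
```

Calling max_error with Δt and Δt/2 gives a quick empirical check of the convergence rate; for a fourth-order method the error should drop by a factor of roughly 16.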


1.11 Exercises and Problems 


Exercise 1.19: Implement the solver via classes 

Reimplement the vib.py program using a class Problem to hold all the physical
parameters of the problem, a class Solver to hold the numerical parameters and
compute the solution, and a class Visualizer to display the solution.


Hint Use the ideas and examples from Sections 5.5.1 and 5.5.2 in [9]. More
specifically, make a superclass Problem for holding the scalar physical parameters of a
problem and let subclasses implement the s(u) and F(t) functions as methods. Try
to call up as much existing functionality in vib.py as possible.

Filename: vib_class. 


Problem 1.20: Use a backward difference for the damping term 
As an alternative to discretizing the damping terms βu' and β|u'|u' by centered
differences, we may apply backward differences:

$$ [u']^n \approx [D_t^- u]^n, $$
$$ [|u'|u']^n \approx [|D_t^- u|D_t^- u]^n = |[D_t^- u]^n|\,[D_t^- u]^n . $$

The advantage of the backward difference is that the damping term is evaluated
using known values $u^n$ and $u^{n-1}$ only. Extend the vib.py code with a scheme
based on using backward differences in the damping terms. Add statements to
compare the original approach with centered differences and the new idea launched
in this exercise. Perform numerical experiments to investigate how much accuracy
is lost by using the backward differences.

Filename: vib_gen_bwdamping. 


Exercise 1.21: Use the forward-backward scheme with quadratic damping 
We consider the generalized model with quadratic damping, expressed as a system
of two first-order equations as in Sect. 1.10.10:

$$ u' = v, $$
$$ v' = \frac{1}{m}\left(F(t) - \beta|v|v - s(u)\right). $$

However, contrary to what is done in Sect. 1.10.10, we want to apply the idea
of a forward-backward discretization: u is marched forward by a one-sided
Forward Euler scheme applied to the first equation, and thereafter v can be marched
forward by a Backward Euler scheme in the second equation, see Sect. 1.7.
Express the idea in operator notation and write out the scheme. Unfortunately, the
backward difference for the v equation creates a nonlinearity $|v^{n+1}|v^{n+1}$. To
linearize this nonlinearity, use the known value $v^n$ inside the absolute value factor,
i.e., $|v^{n+1}|v^{n+1} \approx |v^n|v^{n+1}$. Show that the resulting scheme is
equivalent to the one in Sect. 1.10.10 for some time level n ≥ 1.

What we learn from this exercise is that the first-order differences and the lin- 
earization trick play together in “the right way” such that the scheme is as good 
as when we (in Sect. 1.10.10) carefully apply centered differences and a geometric 
mean on a staggered mesh to achieve second-order accuracy. There is a difference 
in the handling of the initial conditions, though, as explained at the end of Sect. 1.7. 
Filename: vib_gen_bwdamping. 


1.12 Applications of Vibration Models 


The following text derives some of the most well-known physical problems that 
lead to second-order ODE models of the type addressed in this book. We consider 
a simple spring-mass system; thereafter extended with nonlinear spring, damping, 
and external excitation; a spring-mass system with sliding friction; a simple and a 
physical (classical) pendulum; and an elastic pendulum. 


1.12.1 Oscillating Mass Attached to a Spring 


The most fundamental mechanical vibration system is depicted in Fig. 1.17. A body
with mass m is attached to a spring and can move horizontally without friction (in
the wheels). The position of the body is given by the vector
$\boldsymbol{r}(t) = u(t)\boldsymbol{i}$, where $\boldsymbol{i}$ is a unit vector in
x direction. There is only one force acting on the body: a spring force
$\boldsymbol{F}_s = -ku\boldsymbol{i}$, where k is a constant. The point x = 0, where
u = 0, must therefore correspond to the body’s position where the spring is neither
extended nor compressed, so the force vanishes.

Fig. 1.17 Simple oscillating mass

The basic physical principle that governs the motion of the body is Newton’s
second law of motion: $\boldsymbol{F} = m\boldsymbol{a}$, where $\boldsymbol{F}$ is
the sum of forces on the body, m is its mass, and
$\boldsymbol{a} = \ddot{\boldsymbol{r}}$ is the acceleration. We use the dot for
differentiation with respect to time, which is usual in mechanics. Newton’s second
law simplifies here to $\boldsymbol{F}_s = m\ddot u\boldsymbol{i}$, which translates to

$$ -ku = m\ddot u . $$

Two initial conditions are needed: $u(0) = I$, $\dot u(0) = V$. The ODE problem is
normally written as

$$ m\ddot u + ku = 0,\quad u(0) = I,\ \dot u(0) = V. \tag{1.119} $$

It is not uncommon to divide by m and introduce the frequency $\omega = \sqrt{k/m}$:

$$ \ddot u + \omega^2 u = 0,\quad u(0) = I,\ \dot u(0) = V. \tag{1.120} $$

This is the model problem in the first part of this chapter, with the small difference
that we write the time derivative of u with a dot above, while we used u' and u'' in
previous parts of the book.

Since only one scalar mathematical quantity, u(t), describes the complete mo- 
tion, we say that the mechanical system has one degree of freedom (DOF). 


Scaling For numerical simulations it is very convenient to scale (1.120) and
thereby get rid of the problem of finding relevant values for all the parameters m,
k, I, and V. Since the amplitude of the oscillations is dictated by I and V (or
more precisely, V/ω), we scale u by I (or V/ω if I = 0):

$$ \bar u = \frac{u}{I},\quad \bar t = \frac{t}{t_c}. $$

The time scale $t_c$ is normally chosen as the period 2π/ω or the inverse angular
frequency 1/ω, most often as $t_c = 1/\omega$. Inserting the dimensionless quantities
$\bar u$ and $\bar t$ in (1.120) results in the scaled problem

$$ \frac{d^2\bar u}{d\bar t^2} + \bar u = 0,\quad \bar u(0) = 1,\ \frac{d\bar u}{d\bar t}(0) = \beta = \frac{V}{I\omega}, $$

where β is a dimensionless number. Any motion that starts from rest (V = 0) is
free of parameters in the scaled model!

Fig. 1.18 General oscillating system


The physics The typical physics of the system in Fig. 1.17 can be described as
follows. Initially, we displace the body to some position I, say at rest (V = 0). After
releasing the body, the spring, which is extended, will act with a force
$-kI\boldsymbol{i}$ and pull the body to the left. This force causes an acceleration
and therefore increases the velocity. The body passes the point x = 0, where u = 0,
and the spring will then be compressed and act with a force $-ku\boldsymbol{i}$
(pointing in the positive x direction since u < 0) against the motion and cause
retardation. At some point, the motion stops and the velocity is zero, and the spring
force then pushes the body in the positive direction. The result is that the body
accelerates back and forth. As long as there are no friction forces to damp the
motion, the oscillations will continue forever.


1.12.2 General Mechanical Vibrating System 


The mechanical system in Fig. 1.17 can easily be extended to the more general
system in Fig. 1.18, where the body is attached to a spring and a dashpot, and also
subject to an environmental force $F(t)\boldsymbol{i}$. The system has still only one
degree of freedom since the body can only move back and forth parallel to the x axis.
The spring force was linear, $\boldsymbol{F}_s = -ku\boldsymbol{i}$, in Sect. 1.12.1,
but in more general cases it can depend nonlinearly on the position. We therefore set
$\boldsymbol{F}_s = -s(u)\boldsymbol{i}$. The dashpot, which acts as a damper, results
in a force $\boldsymbol{F}_d$ that depends on the body’s velocity $\dot u$ and that
always acts against the motion. The mathematical model of the force is written
$\boldsymbol{F}_d = -f(\dot u)\boldsymbol{i}$. A positive $\dot u$ must result in a
force acting in the negative x direction. Finally, we have the external environmental
force $\boldsymbol{F}_e = F(t)\boldsymbol{i}$.

Newton’s second law of motion now involves three forces:

$$ F(t)\boldsymbol{i} - f(\dot u)\boldsymbol{i} - s(u)\boldsymbol{i} = m\ddot u\boldsymbol{i} . $$

The common mathematical form of the ODE problem is

$$ m\ddot u + f(\dot u) + s(u) = F(t),\quad u(0) = I,\ \dot u(0) = V. \tag{1.121} $$

This is the generalized problem treated in the last part of the present chapter, but
with prime denoting the derivative instead of the dot.

The most common models for the spring and dashpot are linear: $f(\dot u) = b\dot u$
with a constant b > 0, and s(u) = ku for a constant k.


Scaling A specific scaling requires specific choices of f, s, and F. Suppose we
have

$$ f(\dot u) = b|\dot u|\dot u,\quad s(u) = ku,\quad F(t) = A\sin(\phi t). $$

We introduce dimensionless variables as usual, $\bar u = u/u_c$ and $\bar t = t/t_c$.
The scale $u_c$ depends both on the initial conditions and F, but as time grows, the
effect of the initial conditions dies out and F will drive the motion. Inserting
$\bar u$ and $\bar t$ in the ODE gives

$$ m\frac{u_c}{t_c^2}\frac{d^2\bar u}{d\bar t^2} + b\frac{u_c^2}{t_c^2}\left|\frac{d\bar u}{d\bar t}\right|\frac{d\bar u}{d\bar t} + ku_c\bar u = A\sin(\phi t_c\bar t). $$

We divide by $mu_c/t_c^2$ and demand that the $\bar u$ term and the forcing term from
F(t) have unit coefficients. This leads to the scales

$$ t_c = \sqrt{\frac{m}{k}},\quad u_c = \frac{A}{k}. $$

The scaled ODE becomes

$$ \frac{d^2\bar u}{d\bar t^2} + 2\beta\left|\frac{d\bar u}{d\bar t}\right|\frac{d\bar u}{d\bar t} + \bar u = \sin(\gamma\bar t), \tag{1.122} $$

where there are two dimensionless numbers:

$$ \beta = \frac{Ab}{2mk},\quad \gamma = \phi\sqrt{\frac{m}{k}}. $$

The β number measures the size of the damping term (relative to unity) and is
assumed to be small, basically because b is small. The γ number is the ratio of the
time scale of free vibrations and the time scale of the forcing. The scaled initial
conditions have two other dimensionless numbers as values:

$$ \bar u(0) = \frac{Ik}{A},\quad \frac{d\bar u}{d\bar t}(0) = \frac{t_c}{u_c}V = \frac{V}{A}\sqrt{mk}. $$
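
The scales and dimensionless numbers above are easy to compute from a set of physical parameters. The helper below is a small sketch; the function name scaled_parameters and the returned dictionary keys are assumptions made for illustration.

```python
from math import sqrt

def scaled_parameters(m, b, k, A, phi, I, V):
    """Return scales and dimensionless numbers for the model (1.122)."""
    t_c = sqrt(m/k)          # time scale of free vibrations
    u_c = A/k                # displacement scale set by the forcing
    return {
        't_c': t_c,
        'u_c': u_c,
        'beta': A*b/(2*m*k),  # size of the quadratic damping term
        'gamma': phi*t_c,     # forcing frequency relative to 1/t_c
        'u0': I/u_c,          # scaled initial displacement, I*k/A
        'v0': V*t_c/u_c,      # scaled initial velocity, (V/A)*sqrt(m*k)
    }

p = scaled_parameters(m=1.0, b=0.1, k=4.0, A=2.0, phi=3.0, I=0.5, V=1.0)
```

Only the four numbers beta, gamma, u0, and v0 enter the scaled problem, so a parameter study needs to vary these rather than the seven physical inputs.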


1.12.3 A Sliding Mass Attached to a Spring 


Consider a variant of the oscillating body in Sect. 1.12.1 and Fig. 1.17: the body 
rests on a flat surface, and there is sliding friction between the body and the surface. 
Figure 1.19 depicts the problem. 
Fig. 1.19 Sketch of a body sliding on a surface


The body is attached to a spring with spring force $-s(u)\boldsymbol{i}$. The friction
force is proportional to the normal force on the surface, $-mg\boldsymbol{j}$, and
given by $-f(\dot u)\boldsymbol{i}$, where

$$ f(\dot u) = \begin{cases} -\mu mg, & \dot u < 0,\\ \mu mg, & \dot u > 0,\\ 0, & \dot u = 0. \end{cases} $$

Here, μ is a friction coefficient. With the signum function

$$ \mathrm{sign}(x) = \begin{cases} -1, & x < 0,\\ 1, & x > 0,\\ 0, & x = 0, \end{cases} $$

we can simply write $f(\dot u) = \mu mg\,\mathrm{sign}(\dot u)$ (the sign function is
implemented by numpy.sign).

The equation of motion becomes

$$ m\ddot u + \mu mg\,\mathrm{sign}(\dot u) + s(u) = 0,\quad u(0) = I,\ \dot u(0) = V. \tag{1.123} $$
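
As a small illustration of how the friction term in (1.123) may be coded with numpy.sign, consider the sketch below. The function names friction and acceleration, and the default value of g, are assumptions made for this example.

```python
import numpy as np

def friction(v, mu, m, g=9.81):
    """Sliding friction f(v) = mu*m*g*sign(v); works for scalars and arrays."""
    return mu*m*g*np.sign(v)

def acceleration(u, v, mu, m, s, g=9.81):
    """Acceleration u'' from (1.123) for a given spring function s(u)."""
    return -(friction(v, mu, m, g) + s(u))/m

# Friction force for a body moving left, at rest, and moving right
f_values = friction(np.array([-2.0, 0.0, 3.0]), mu=0.5, m=2.0, g=10.0)
```

Note that the friction magnitude is independent of the speed, so the function is discontinuous at v = 0.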


1.12.4 A Jumping Washing Machine 


A washing machine is placed on four springs with efficient dampers. If the machine 
contains just a few clothes, the circular motion of the machine induces a sinusoidal 
external force from the floor and the machine will jump up and down if the fre- 
quency of the external force is close to the natural frequency of the machine and its 
spring-damper system. 


1.12.5 Motion of a Pendulum 


Simple pendulum A classical problem in mechanics is the motion of a pendulum.
We first consider a simplified pendulum¹⁷ (sometimes also called a mathematical
pendulum): a small body of mass m is attached to a massless wire and can oscillate
back and forth in the gravity field. Figure 1.20 shows a sketch of the problem.

Fig. 1.20 Sketch of a simple pendulum

17 https://en.wikipedia.org/wiki/Pendulum

The motion is governed by Newton’s 2nd law, so we need to find expressions
for the forces and the acceleration. Three forces on the body are considered: an
unknown force S from the wire, the gravity force mg, and an air resistance force,
$\frac{1}{2}C_D\varrho A|v|v$, hereafter called the drag force, directed against the
velocity of the body. Here, $C_D$ is a drag coefficient, $\varrho$ is the density of
air, A is the cross section area of the body, and v is the magnitude of the velocity.

We introduce a coordinate system with polar coordinates and unit vectors
$\boldsymbol{i}_r$ and $\boldsymbol{i}_\theta$ as shown in Fig. 1.21. The position of
the center of mass of the body is

$$ \boldsymbol{r}(t) = x_0\boldsymbol{i} + y_0\boldsymbol{j} + L\boldsymbol{i}_r, $$

where $\boldsymbol{i}$ and $\boldsymbol{j}$ are unit vectors in the corresponding
Cartesian coordinate system in the x and y directions, respectively. With θ measured
from the downward vertical, we have that
$\boldsymbol{i}_r = \sin\theta\,\boldsymbol{i} - \cos\theta\,\boldsymbol{j}$.

The forces are now expressed as follows.

• Wire force: $-S\boldsymbol{i}_r$
• Gravity force: $-mg\boldsymbol{j} = mg(-\sin\theta\,\boldsymbol{i}_\theta + \cos\theta\,\boldsymbol{i}_r)$
• Drag force: $-\frac{1}{2}C_D\varrho A|v|v\,\boldsymbol{i}_\theta$

Since a positive velocity means movement in the direction of $\boldsymbol{i}_\theta$,
the drag force must be directed along $-\boldsymbol{i}_\theta$ so it works against the
motion. We assume motion in air so that the added mass effect can be neglected (for a
spherical body, the added mass is $\frac{1}{2}\varrho V$, where V is the volume of the
body). Also the buoyancy effect can be neglected for motion in the air when the
density difference between the fluid and the body is so significant.
Fig. 1.21 Forces acting on a simple pendulum


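The velocity of the body is found from $\boldsymbol{r}$:

$$ \boldsymbol{v}(t) = \dot{\boldsymbol{r}}(t) = \frac{d}{dt}(x_0\boldsymbol{i} + y_0\boldsymbol{j} + L\boldsymbol{i}_r) = L\dot\theta\,\boldsymbol{i}_\theta, $$

since $\frac{d}{d\theta}\boldsymbol{i}_r = \boldsymbol{i}_\theta$. It follows that
$v = |\boldsymbol{v}| = L\dot\theta$. The acceleration is

$$ \boldsymbol{a}(t) = \dot{\boldsymbol{v}}(t) = \frac{d}{dt}(L\dot\theta\,\boldsymbol{i}_\theta) = L\ddot\theta\,\boldsymbol{i}_\theta + L\dot\theta\frac{d\boldsymbol{i}_\theta}{d\theta}\dot\theta = L\ddot\theta\,\boldsymbol{i}_\theta - L\dot\theta^2\boldsymbol{i}_r, $$

since $\frac{d}{d\theta}\boldsymbol{i}_\theta = -\boldsymbol{i}_r$.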
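Newton’s 2nd law of motion becomes

$$ -S\boldsymbol{i}_r + mg(-\sin\theta\,\boldsymbol{i}_\theta + \cos\theta\,\boldsymbol{i}_r) - \frac{1}{2}C_D\varrho AL^2|\dot\theta|\dot\theta\,\boldsymbol{i}_\theta = mL\ddot\theta\,\boldsymbol{i}_\theta - mL\dot\theta^2\boldsymbol{i}_r, $$

leading to two component equations

$$ -S + mg\cos\theta = -mL\dot\theta^2, \tag{1.124} $$
$$ -mg\sin\theta - \frac{1}{2}C_D\varrho AL^2|\dot\theta|\dot\theta = mL\ddot\theta. \tag{1.125} $$

From (1.124) we get an expression for $S = mg\cos\theta + mL\dot\theta^2$, and from
(1.125) we get a differential equation for the angle θ(t). This latter equation is
ordered as

$$ m\ddot\theta + \frac{1}{2}C_D\varrho AL|\dot\theta|\dot\theta + \frac{mg}{L}\sin\theta = 0. \tag{1.126} $$

Two initial conditions are needed: $\theta(0) = \Theta$ and $\dot\theta(0) = \Omega$.
Normally, the pendulum motion is started from rest, which means Ω = 0.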

Equation (1.126) fits the general model used in (1.71) in Sect. 1.10 if we define
u = θ, $f(u') = \frac{1}{2}C_D\varrho AL|\dot u|\dot u$, $s(u) = L^{-1}mg\sin u$, and
F = 0. If the body is a sphere with radius R, we can take $C_D = 0.4$ and
$A = \pi R^2$. Exercise 1.25 asks you to scale the equations and carry out specific
simulations with this model.


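Physical pendulum The motion of a compound or physical pendulum, where the
wire is a rod with mass, can be modeled very similarly. The governing equation
is $I\boldsymbol{a} = \boldsymbol{T}$, where I is the moment of inertia of the entire
body about the point $(x_0, y_0)$, and $\boldsymbol{T}$ is the sum of moments of the
forces with respect to $(x_0, y_0)$. The vector equation reads

$$ \boldsymbol{r}\times\left(-S\boldsymbol{i}_r + mg(-\sin\theta\,\boldsymbol{i}_\theta + \cos\theta\,\boldsymbol{i}_r) - \frac{1}{2}C_D\varrho AL^2|\dot\theta|\dot\theta\,\boldsymbol{i}_\theta\right) = I(L\ddot\theta\,\boldsymbol{i}_\theta - L\dot\theta^2\boldsymbol{i}_r). $$

The component equation in $\boldsymbol{i}_\theta$ direction gives the equation of
motion for θ(t):

$$ I\ddot\theta + \frac{1}{2}C_D\varrho AL^3|\dot\theta|\dot\theta + mgL\sin\theta = 0. \tag{1.127} $$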


1.12.6 Dynamic Free Body Diagram During Pendulum Motion 


Usually one plots the mathematical quantities as functions of time to visualize the
solution of ODE models. Exercise 1.25 asks you to do this for the motion of a
pendulum in the previous section. However, sometimes it is more instructive to look
at other types of visualizations. For example, we have the pendulum and the free
body diagram in Figs. 1.20 and 1.21. We may think of these figures as animations
in time instead. Especially the free body diagram will show both the motion of
the pendulum and the size of the forces during the motion. The present section
exemplifies how to make such a dynamic body diagram. Two typical snapshots
of free body diagrams are displayed below (the drag force is magnified 5 times to
become more visible!).

[Figure: two snapshots of the animated free body diagram]

Dynamic physical sketches, coupled to the numerical solution of differential
equations, require a program to produce a sketch for the situation at each time
level. Pysketcher¹⁸ is such a tool. In fact (and not surprisingly!) Figs. 1.20 and 1.21
were drawn using Pysketcher. The details of the drawings are explained in the
Pysketcher tutorial¹⁹. Here, we outline how this type of sketch can be used to create an
animated free body diagram during the motion of a pendulum.

Pysketcher is actually a layer of useful abstractions on top of standard plotting
packages. This means that we in fact apply Matplotlib to make the animated free
body diagram, but instead of dealing with a wealth of detailed Matplotlib
commands, we can express the drawing in terms of more high-level objects, e.g., objects
for the wire, angle θ, body with mass m, arrows for forces, etc. When the positions
of these objects are given through variables, we can just couple those variables to
the dynamic solution of our ODE and thereby make a unique drawing for each θ
value in a simulation.


Writing the solver Let us start with the most familiar part of the current problem:
writing the solver function. We use Odespy for this purpose. We also work with
dimensionless equations. Since θ can be viewed as dimensionless, we only need
to introduce a dimensionless time, here taken as $\bar t = t/\sqrt{L/g}$. The resulting
dimensionless mathematical model for θ, the dimensionless angular velocity ω, the
dimensionless wire force $\bar S$, and the dimensionless drag force $\bar D$ is then

$$ \frac{d\omega}{d\bar t} = -\alpha|\omega|\omega - \sin\theta, \tag{1.128} $$
$$ \frac{d\theta}{d\bar t} = \omega, \tag{1.129} $$
$$ \bar S = \omega^2 + \cos\theta, \tag{1.130} $$
$$ \bar D = -\alpha|\omega|\omega, \tag{1.131} $$

with

$$ \alpha = \frac{C_D\varrho\pi R^2L}{2m}, $$

as a dimensionless parameter expressing the ratio of the drag force and the gravity
force. The dimensionless ω is made non-dimensional by the time, so
$\omega\sqrt{g/L}$ is the corresponding angular frequency with dimensions.

A suitable function for computing (1.128)-(1.131) is listed below.

def simulate(alpha, Theta, dt, T):
    import odespy
    import numpy as np

    def f(u, t, alpha):
        # Right-hand side of (1.128)-(1.129); u = [omega, theta]
        omega, theta = u
        return [-alpha*omega*abs(omega) - np.sin(theta),
                omega]

    Nt = int(round(T/float(dt)))
    t = np.linspace(0, Nt*dt, Nt+1)
    solver = odespy.RK4(f, f_args=[alpha])
    solver.set_initial_condition([0, Theta])
    # Stop the simulation when the angle is (almost) zero
    u, t = solver.solve(
        t, terminate=lambda u, t, n: abs(u[n,1]) < 1E-3)
    omega = u[:,0]
    theta = u[:,1]
    S = omega**2 + np.cos(theta)          # wire force (1.130)
    drag = -alpha*np.abs(omega)*omega     # drag force (1.131)
    return t, theta, omega, S, drag


18 https://github.com/hplgit/pysketcher
19 http://hplgit.github.io/pysketcher/doc/web/index.html
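
If Odespy is not available, the same model can be integrated with a hand-written classical fourth-order Runge-Kutta method. The following is a minimal sketch under stated assumptions: the names rhs and simulate_rk4 are hypothetical, the function returns the same quantities as simulate, and the terminate criterion is left out for brevity.

```python
import numpy as np

def rhs(u, t, alpha):
    """Right-hand side of (1.128)-(1.129); u = [omega, theta]."""
    omega, theta = u
    return np.array([-alpha*omega*abs(omega) - np.sin(theta), omega])

def simulate_rk4(alpha, Theta, dt, T):
    """Integrate the scaled pendulum model by the classical RK4 method."""
    Nt = int(round(T/float(dt)))
    t = np.linspace(0, Nt*dt, Nt+1)
    u = np.zeros((Nt+1, 2))
    u[0] = [0, Theta]                      # start from rest at angle Theta
    for n in range(Nt):
        k1 = rhs(u[n], t[n], alpha)
        k2 = rhs(u[n] + 0.5*dt*k1, t[n] + 0.5*dt, alpha)
        k3 = rhs(u[n] + 0.5*dt*k2, t[n] + 0.5*dt, alpha)
        k4 = rhs(u[n] + dt*k3, t[n] + dt, alpha)
        u[n+1] = u[n] + dt/6.0*(k1 + 2*k2 + 2*k3 + k4)
    omega = u[:,0]
    theta = u[:,1]
    S = omega**2 + np.cos(theta)           # wire force (1.130)
    drag = -alpha*np.abs(omega)*omega      # drag force (1.131)
    return t, theta, omega, S, drag
```

For α = 0 and a small initial angle the computed θ should stay close to Θ cos t, which is a convenient first check of the implementation.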


Drawing the free body diagram The sketch function below applies Pysketcher
objects to build a diagram like that in Fig. 1.21, except that we have removed the
rotation point $(x_0, y_0)$ and the unit vectors in polar coordinates as these objects
are not important for an animated free body diagram.


import sys
try:
    from pysketcher import *
except ImportError:
    print('Pysketcher must be installed from')
    print('https://github.com/hplgit/pysketcher')
    sys.exit(1)

# Overall dimensions of sketch
H = 15.
W = 17.

drawing_tool.set_coordinate_system(
    xmin=0, xmax=W, ymin=0, ymax=H,
    axis=False)

def sketch(theta, S, mg, drag, t, time_level):
    """
    Draw pendulum sketch with body forces at a time level
    corresponding to time t. The drag force is in
    drag[time_level], the force in the wire is S[time_level],
    the angle is theta[time_level].
    """
    import math
    a = math.degrees(theta[time_level])  # angle in degrees
    L = 0.4*H         # Length of pendulum
    P = (W/2, 0.8*H)  # Fixed rotation point

    # Assumed definition (the listing uses path without defining it):
    # arc traced by the mass from the vertical to the angle a
    path = Arc(P, L, -90, a)
    mass_pt = path.geometric_features()['end']
    rod = Line(P, mass_pt)

    mass = Circle(center=mass_pt, radius=L/20.)
    mass.set_filled_curves(color='blue')
    rod_vec = rod.geometric_features()['end'] - \
              rod.geometric_features()['start']
    unit_rod_vec = unit_vec(rod_vec)
    mass_symbol = Text('$m$', mass_pt + L/10*unit_rod_vec)

    rod_start = rod.geometric_features()['start']  # Point P
    vertical = Line(rod_start, rod_start + point(0,-L/3))

    def set_dashed_thin_blackline(*objects):
        """Set linestyle of objects to dashed, black, width=1."""
        for obj in objects:
            obj.set_linestyle('dashed')
            obj.set_linecolor('black')
            obj.set_linewidth(1)

    set_dashed_thin_blackline(vertical)
    set_dashed_thin_blackline(rod)
    angle = Arc_wText(r'$\theta$', rod_start, L/6, -90, a,
                      text_spacing=1/30.)

    magnitude = 1.2*L/2   # length of a unit force in figure
    force = mg[time_level]  # constant (scaled eq: about 1)
    force *= magnitude
    mg_force = Force(mass_pt, mass_pt + force*point(0,-1),
                     '', text_pos='end')
    force = S[time_level]
    force *= magnitude
    rod_force = Force(mass_pt, mass_pt - force*unit_vec(rod_vec),
                      '', text_pos='end',
                      text_spacing=(0.03, 0.01))
    force = drag[time_level]
    force *= magnitude
    air_force = Force(mass_pt, mass_pt -
                      force*unit_vec((rod_vec[1], -rod_vec[0])),
                      '', text_pos='end',
                      text_spacing=(0.04,0.005))

    body_diagram = Composition(
        {'mg': mg_force, 'S': rod_force, 'air': air_force,
         'rod': rod, 'body': mass,
         'vertical': vertical, 'theta': angle,})

    body_diagram.draw(verbose=0)
    drawing_tool.savefig('tmp_%04d.png' % time_level, crop=False)
    # (No cropping: otherwise movies will be very strange!)

Making the animated free body diagram It now remains to couple the simulate
and sketch functions. We first run simulate:


from math import pi, radians, degrees
import numpy as np
alpha = 0.4
period = 2*pi    # Use small theta approximation
T = 12*period    # Simulate for 12 periods
dt = period/40   # 40 time steps per period
a = 70           # Initial amplitude in degrees
Theta = radians(a)

t, theta, omega, S, drag = simulate(alpha, Theta, dt, T)

The next step is to run through the time levels in the simulation and make a sketch
at each level:


for time_level, t_ in enumerate(t):
    sketch(theta, S, mg, drag, t_, time_level)


The individual sketches are (by the sketch function) saved in files with names
tmp_%04d.png. These can be combined to videos using (e.g.) ffmpeg. A complete
function animate for running the simulation and creating video files is listed below.


def animate():
    # Clean up old plot files
    import os, glob
    for filename in glob.glob('tmp_*.png') + glob.glob('movie.*'):
        os.remove(filename)
    # Solve problem
    from math import pi, radians, degrees
    import numpy as np
    alpha = 0.4
    period = 2*pi    # Use small theta approximation
    T = 12*period    # Simulate for 12 periods
    dt = period/40   # 40 time steps per period
    a = 70           # Initial amplitude in degrees
    Theta = radians(a)

    t, theta, omega, S, drag = simulate(alpha, Theta, dt, T)

    # Visualize drag force 5 times as large
    drag *= 5
    mg = np.ones(S.size)  # Gravity force (needed in sketch)

    # Draw animation
    import time
    for time_level, t_ in enumerate(t):
        sketch(theta, S, mg, drag, t_, time_level)
        time.sleep(0.2)  # Pause between each frame on the screen

    # Make videos
    prog = 'ffmpeg'
    filename = 'tmp_%04d.png'
    fps = 6
    codecs = {'flv': 'flv', 'mp4': 'libx264',
              'webm': 'libvpx', 'ogg': 'libtheora'}
    for ext in codecs:
        lib = codecs[ext]
        cmd = '%(prog)s -i %(filename)s -r %(fps)s ' % vars()
        cmd += '-vcodec %(lib)s movie.%(ext)s' % vars()
        print(cmd)
        os.system(cmd)


1.12.7 Motion of an Elastic Pendulum


Consider a pendulum as in Fig. 1.20, but this time the wire is elastic. The length of
the wire when it is not stretched is $L_0$, while L(t) is the stretched length at time t
during the motion.

Stretching the elastic wire a distance ΔL gives rise to a spring force kΔL in the
opposite direction of the stretching. Let $\boldsymbol{n}$ be a unit vector along the
wire from the point $\boldsymbol{r}_0 = (x_0, y_0)$ and in the direction of
$\boldsymbol{i}_r$, see Fig. 1.21 for definition of $(x_0, y_0)$ and
$\boldsymbol{i}_r$. Obviously, we have $\boldsymbol{n} = \boldsymbol{i}_r$, but in this
modeling of an elastic pendulum we do not need polar coordinates. Instead, it is more
straightforward to develop the equation in Cartesian coordinates.

A mathematical expression for $\boldsymbol{n}$ is

$$ \boldsymbol{n} = \frac{\boldsymbol{r} - \boldsymbol{r}_0}{L(t)}, $$

where $L(t) = ||\boldsymbol{r} - \boldsymbol{r}_0||$ is the current length of the
elastic wire. The position vector $\boldsymbol{r}$ in Cartesian coordinates reads
$\boldsymbol{r}(t) = x(t)\boldsymbol{i} + y(t)\boldsymbol{j}$, where $\boldsymbol{i}$
and $\boldsymbol{j}$ are unit vectors in the x and y directions, respectively. It is
convenient to introduce the Cartesian components $n_x$ and $n_y$ of the vector
$\boldsymbol{n}$:

$$ \boldsymbol{n} = \frac{\boldsymbol{r} - \boldsymbol{r}_0}{L(t)} = \frac{x(t) - x_0}{L(t)}\boldsymbol{i} + \frac{y(t) - y_0}{L(t)}\boldsymbol{j} = n_x\boldsymbol{i} + n_y\boldsymbol{j}. $$

The stretch ΔL in the wire is

$$ \Delta L = L(t) - L_0. $$

The force in the wire is then $-S\boldsymbol{n} = -k\Delta L\boldsymbol{n}$.

The other forces are the gravity and the air resistance, just as in Fig. 1.21. For
motion in air we can neglect the added mass and buoyancy effects. The main
difference is that we have a model for S in terms of the motion (as soon as we have
expressed ΔL by $\boldsymbol{r}$). For simplicity, we drop the air resistance term
(but Exercise 1.27 asks you to include it).

Newton’s second law of motion applied to the body now results in

$$ m\ddot{\boldsymbol{r}} = -k(L - L_0)\boldsymbol{n} - mg\boldsymbol{j}. \tag{1.132} $$

The two components of (1.132) are

$$ \ddot x = -\frac{k}{m}(L - L_0)n_x, \tag{1.133} $$
$$ \ddot y = -\frac{k}{m}(L - L_0)n_y - g. \tag{1.134} $$



Remarks about an elastic vs a non-elastic pendulum Note that the derivation 
of the ODEs for an elastic pendulum is more straightforward than for a classical, 
non-elastic pendulum, since we avoid the details with polar coordinates, but instead 
work with Newton’s second law directly in Cartesian coordinates. The reason why 
we can do this is that the elastic pendulum undergoes a general two-dimensional 
motion where all the forces are known or expressed as functions of x(t) and y(t), 
such that we get two ordinary differential equations. The motion of the non-elastic 
pendulum, on the other hand, is constrained: the body has to move along a circular 
path, and the force S in the wire is unknown. 

The non-elastic pendulum therefore leads to a differential-algebraic equation,
i.e., ODEs for x(t) and y(t) combined with an extra constraint
$(x - x_0)^2 + (y - y_0)^2 = L^2$ ensuring that the motion takes place along a
circular path. The extra constraint (equation) is compensated by an extra unknown
force $-S\boldsymbol{n}$. Differential-algebraic equations are normally hard to solve,
especially with pen and paper. Fortunately, for the non-elastic pendulum we can do a
trick: in polar coordinates the unknown force S appears only in the radial component
of Newton’s second law, while the unknown degree of freedom for describing the motion,
the angle θ(t), is completely governed by the azimuthal component. This allows us to
decouple the unknowns S and θ. But this is a kind of trick and not a widely applicable
method. With an elastic pendulum we use straightforward reasoning with Newton’s 2nd
law and arrive at a standard ODE problem that (after scaling) is easy to solve on a
computer.


Initial conditions What is the initial position of the body? We imagine that first
the pendulum hangs in equilibrium in its vertical position, and then it is displaced an
angle Θ. The equilibrium position is governed by the ODEs with the accelerations
set to zero. The x component leads to $x(t) = x_0$, while the y component gives

$$ 0 = -\frac{k}{m}(L - L_0)n_y - g = \frac{k}{m}(L(0) - L_0) - g \quad\Rightarrow\quad L(0) = L_0 + mg/k, $$

since $n_y = -1$ in this position. The corresponding y value is then from $n_y = -1$:

$$ y(t) = y_0 - L(0) = y_0 - (L_0 + mg/k). $$

Let us now choose $(x_0, y_0)$ such that the body is at the origin in the equilibrium
position:

$$ x_0 = 0,\quad y_0 = L_0 + mg/k. $$

Displacing the body an angle Θ to the right leads to the initial position

$$ x(0) = (L_0 + mg/k)\sin\Theta,\quad y(0) = (L_0 + mg/k)(1 - \cos\Theta). $$

The initial velocities can be set to zero: x'(0) = y'(0) = 0.

The complete ODE problem We can summarize all the equations as follows:

$$ \ddot x = -\frac{k}{m}(L - L_0)n_x, $$
$$ \ddot y = -\frac{k}{m}(L - L_0)n_y - g, $$
$$ L = \sqrt{(x - x_0)^2 + (y - y_0)^2}, $$
$$ n_x = \frac{x - x_0}{L}, $$
$$ n_y = \frac{y - y_0}{L}, $$
$$ x(0) = (L_0 + mg/k)\sin\Theta, $$
$$ x'(0) = 0, $$
$$ y(0) = (L_0 + mg/k)(1 - \cos\Theta), $$
$$ y'(0) = 0. $$

We insert $n_x$ and $n_y$ in the ODEs:

$$ \ddot x = -\frac{k}{m}\left(1 - \frac{L_0}{L}\right)(x - x_0), \tag{1.135} $$
$$ \ddot y = -\frac{k}{m}\left(1 - \frac{L_0}{L}\right)(y - y_0) - g, \tag{1.136} $$
$$ L = \sqrt{(x - x_0)^2 + (y - y_0)^2}, \tag{1.137} $$
$$ x(0) = (L_0 + mg/k)\sin\Theta, \tag{1.138} $$
$$ x'(0) = 0, \tag{1.139} $$
$$ y(0) = (L_0 + mg/k)(1 - \cos\Theta), \tag{1.140} $$
$$ y'(0) = 0. \tag{1.141} $$
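
Before scaling, the problem (1.135)-(1.141) can be written as a first-order system that any ODE solver accepts. The sketch below is an illustration only: the function names, the ordering of the state vector (x, x', y, y'), and the default value of g are assumptions, and the choice $x_0 = 0$, $y_0 = L_0 + mg/k$ from the text is hardcoded.

```python
import numpy as np

def elastic_pendulum_rhs(state, t, k, m, L0, g=9.81):
    """Right-hand side of (1.135)-(1.136) with state = [x, x', y, y']."""
    x, vx, y, vy = state
    x0, y0 = 0.0, L0 + m*g/k            # body at origin in equilibrium
    L = np.sqrt((x - x0)**2 + (y - y0)**2)
    factor = -k/m*(1 - L0/L)
    return np.array([vx, factor*(x - x0),
                     vy, factor*(y - y0) - g])

def initial_state(Theta, k, m, L0, g=9.81):
    """Initial conditions (1.138)-(1.141) for a displacement angle Theta."""
    Lc = L0 + m*g/k                     # wire length in equilibrium
    return np.array([Lc*np.sin(Theta), 0.0,
                     Lc*(1 - np.cos(Theta)), 0.0])
```

A quick sanity check is that the right-hand side vanishes at the equilibrium state (0, 0, 0, 0), so that a body placed there at rest stays there.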


Scaling The elastic pendulum model can be used to study both an elastic pendulum 
and a classic, non-elastic pendulum. The latter problem is obtained by letting k > 
oo. Unfortunately, a serious problem with the ODEs (1.135)—(1.136) is that for 
large k, we have a very large factor k/m multiplied by a very small number 1 — 
Lo/L, since for large k, L ~ Lo (very small deformations of the wire). The 
product is subject to significant round-off errors for many relevant physical values 
of the parameters. To circumvent the problem, we introduce a scaling. This will also 
remove physical parameters from the problem such that we end up with only one 
dimensionless parameter, closely related to the elasticity of the wire. Simulations 
can then be done by setting just this dimensionless parameter. 

The characteristic length can be taken such that in equilibrium, the scaled length
is unity, i.e., the characteristic length is $L_0 + mg/k$:

$$\bar x = \frac{x}{L_0 + mg/k}, \quad \bar y = \frac{y}{L_0 + mg/k} .$$

We must then also work with the scaled length $\bar L = L/(L_0 + mg/k)$.
Introducing $\bar t = t/t_c$, where $t_c$ is a characteristic time we have to decide upon
later, one gets

$$\frac{d^2\bar x}{d\bar t^2} = -t_c^2\frac{k}{m}\left(1 - \frac{L_0}{L_0 + mg/k}\frac{1}{\bar L}\right)\bar x,$$
$$\frac{d^2\bar y}{d\bar t^2} = -t_c^2\frac{k}{m}\left(1 - \frac{L_0}{L_0 + mg/k}\frac{1}{\bar L}\right)(\bar y - 1) - t_c^2\frac{g}{L_0 + mg/k},$$
$$\bar L = \sqrt{\bar x^2 + (\bar y - 1)^2},$$
$$\bar x(0) = \sin\Theta,$$
$$\bar x'(0) = 0,$$
$$\bar y(0) = 1 - \cos\Theta,$$
$$\bar y'(0) = 0 .$$


For a non-elastic pendulum with small angles, we know that the angular frequency of the
oscillations is $\omega = \sqrt{g/L}$, so the corresponding time scale is $1/\omega = \sqrt{L/g}$.
It is therefore natural to choose a similar expression here, based on either the length
in the equilibrium position,

$$t_c^2 = \frac{L_0 + mg/k}{g},$$

or simply the unstretched length,

$$t_c^2 = \frac{L_0}{g} .$$

These quantities are not very different (since the elastic model is valid only for quite
small elongations), so we take the latter as it is the simplest one.
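The ODEs become

$$\frac{d^2\bar x}{d\bar t^2} = -\frac{L_0 k}{mg}\left(1 - \frac{L_0}{L_0 + mg/k}\frac{1}{\bar L}\right)\bar x,$$
$$\frac{d^2\bar y}{d\bar t^2} = -\frac{L_0 k}{mg}\left(1 - \frac{L_0}{L_0 + mg/k}\frac{1}{\bar L}\right)(\bar y - 1) - \frac{L_0}{L_0 + mg/k},$$
$$\bar L = \sqrt{\bar x^2 + (\bar y - 1)^2} .$$

We can now identify a dimensionless number

$$\beta = \frac{L_0}{L_0 + mg/k} = \frac{1}{1 + \frac{mg}{L_0 k}},$$

which is the ratio of the unstretched length and the stretched length in equilibrium.
The non-elastic pendulum will have $\beta = 1$ ($k \to \infty$). Since
$L_0 k/(mg) = \beta/(1-\beta)$, the ODEs read, expressed with $\beta$,

$$\frac{d^2\bar x}{d\bar t^2} = -\frac{\beta}{1-\beta}\left(1 - \frac{\beta}{\bar L}\right)\bar x, \tag{1.142}$$
$$\frac{d^2\bar y}{d\bar t^2} = -\frac{\beta}{1-\beta}\left(1 - \frac{\beta}{\bar L}\right)(\bar y - 1) - \beta, \tag{1.143}$$
$$\bar L = \sqrt{\bar x^2 + (\bar y - 1)^2}, \tag{1.144}$$
$$\bar x(0) = (1 + \epsilon)\sin\Theta, \tag{1.145}$$
$$\frac{d\bar x}{d\bar t}(0) = 0, \tag{1.146}$$
$$\bar y(0) = 1 - (1 + \epsilon)\cos\Theta, \tag{1.147}$$
$$\frac{d\bar y}{d\bar t}(0) = 0 . \tag{1.148}$$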


We have here added a parameter $\epsilon$, which is an additional downward stretch of
the wire at $t = 0$. This parameter makes it possible to do a desired test: vertical
oscillations of the pendulum. Without $\epsilon$, starting the motion from $(0,0)$ with zero
velocity will result in $\bar x = \bar y = 0$ for all times (also a good test!), but with an initial
stretch so the body's position is $(0, -\epsilon)$, as follows from (1.147) with $\Theta = 0$, we will
have oscillatory vertical motion with amplitude $\epsilon$ (see Exercise 1.26).
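The scaled model (1.142)-(1.148) is compact enough to sketch directly in code. The following is a minimal illustration, not the book's implementation: the names `rhs` and `euler_cromer` are hypothetical, and the Euler-Cromer time stepping is just one simple choice of solver. It can be used to check the two tests mentioned above (rest at the origin, and vertical oscillations with amplitude $\epsilon$).

```python
import numpy as np

def rhs(state, beta):
    """Right-hand side of (1.142)-(1.143) written as a first-order system.

    state = [x, vx, y, vy] holds the scaled position and velocity.
    """
    x, vx, y, vy = state
    L = np.sqrt(x**2 + (y - 1)**2)            # scaled wire length (1.144)
    factor = beta/(1 - beta)*(1 - beta/L)
    ax = -factor*x
    ay = -factor*(y - 1) - beta
    return np.array([vx, ax, vy, ay])

def euler_cromer(beta, Theta_deg=0.0, epsilon=0.0, T=2*np.pi, n=2000):
    """Integrate the scaled model with n Euler-Cromer steps up to time T."""
    Theta = np.radians(Theta_deg)
    dt = T/n
    x = (1 + epsilon)*np.sin(Theta)           # initial conditions (1.145), (1.147)
    y = 1 - (1 + epsilon)*np.cos(Theta)
    vx = vy = 0.0
    xs, ys = [x], [y]
    for _ in range(n):
        _, ax, _, ay = rhs([x, vx, y, vy], beta)
        vx += dt*ax                           # update velocities first,
        vy += dt*ay
        x += dt*vx                            # then positions with the new velocities
        y += dt*vy
        xs.append(x)
        ys.append(y)
    return np.array(xs), np.array(ys)
```

Calling `euler_cromer(beta=0.9, epsilon=0.1)` with the default $\Theta = 0$ gives purely vertical motion, so the returned `xs` stay at zero while `ys` oscillate about the origin.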


Remark on the non-elastic limit We immediately see that as $k \to \infty$ (i.e., we
obtain a non-elastic pendulum), $\beta \to 1$, $\bar L \to 1$, and we have very small values
$1 - \beta\bar L^{-1}$ divided by very small values $1 - \beta$ in the ODEs. However, it turns
out that we can set $\beta$ very close to one and obtain a path of the body that within the
visual accuracy of a plot does not show any elastic oscillations. (Should the division
of very small values become a problem, one can study the limit by L'Hospital's rule:

$$\lim_{\beta\to 1}\frac{1 - \beta\bar L^{-1}}{1 - \beta} = \frac{1}{\bar L},$$

and use the limit $\bar L^{-1}$ in the ODEs for $\beta$ values very close to 1.)


1.12.8 Vehicle on a Bumpy Road 


We consider a very simplistic vehicle, on one wheel, rolling along a bumpy road. 
The oscillatory nature of the road will induce an external forcing on the spring 
system in the vehicle and cause vibrations. Figure 1.22 outlines the situation. 

Fig. 1.22 Sketch of one-wheel vehicle on a bumpy road

To derive the equation that governs the motion, we must first establish the position
vector of the black mass at the top of the spring. Suppose the spring has length
$L$ without any elongation or compression, suppose the radius of the wheel is $R$, and
suppose the height of the black mass at the top is $H$. With the aid of the $\boldsymbol{r}_0$ vector
in Fig. 1.22, the position $\boldsymbol{r}$ of the center point of the mass is

$$\boldsymbol{r} = \boldsymbol{r}_0 + 2R\boldsymbol{j} + L\boldsymbol{j} + u\boldsymbol{j} + \frac{1}{2}H\boldsymbol{j}, \tag{1.149}$$

where $u$ is the elongation or compression in the spring according to the (unknown
and to be computed) vertical displacement $u$ relative to the road. If the vehicle
travels with constant horizontal velocity $v$ and $h(x)$ is the shape of the road, then
the vector $\boldsymbol{r}_0$ is

$$\boldsymbol{r}_0 = vt\,\boldsymbol{i} + h(vt)\,\boldsymbol{j},$$

if the motion starts from $x = 0$ at time $t = 0$.

The forces on the mass are the gravity, the spring force, and an optional damping
force that is proportional to the vertical velocity $\dot u$. Newton's second law of motion
then tells that

$$m\ddot{\boldsymbol{r}} = -mg\,\boldsymbol{j} - s(u)\,\boldsymbol{j} - b\dot u\,\boldsymbol{j} .$$

Since $\ddot{\boldsymbol{r}} = \left(\ddot u + h''(vt)v^2\right)\boldsymbol{j}$, this leads to

$$m\ddot u = -s(u) - b\dot u - mg - mh''(vt)v^2 .$$

To simplify a little bit, we omit the gravity force $mg$ in comparison with the
other terms. Introducing $u'$ for $\dot u$ then gives a standard damped vibration equation
with external forcing:

$$mu'' + bu' + s(u) = -mh''(vt)v^2 . \tag{1.150}$$

Since the road is normally known just as a set of array values, $h''$ must be computed
by finite differences. Let $\Delta x$ be the spacing between measured values $h_i = h(i\Delta x)$
on the road. The discrete second-order derivative $h''$ reads

$$q_i = \frac{h_{i-1} - 2h_i + h_{i+1}}{\Delta x^2}, \quad i = 1,\ldots,N_x - 1 .$$

We may for maximum simplicity set the end points as $q_0 = q_1$ and $q_{N_x} = q_{N_x-1}$.
The term $-mh''(vt)v^2$ corresponds to a force with discrete time values

$$F^n = -mq_nv^2, \quad \Delta t = v^{-1}\Delta x .$$

This force can be directly used in a numerical model

$$[mD_tD_tu + bD_{2t}u + s(u) = F]^n .$$

Software for computing $u$ and also making an animated sketch of the motion
like we did in Sect. 1.12.6 is found in a separate project on the web:
https://github.com/hplgit/bumpy. You may start looking at the tutorial (footnote 20).
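The discrete forcing $F^n = -mq_nv^2$ can be sketched in a few lines of NumPy. This is an illustration under stated assumptions and is not taken from the bumpy project: the function name `road_forcing` and its argument names are hypothetical, and the road is assumed to be given as an array `h` of heights sampled with uniform spacing `dx`.

```python
import numpy as np

def road_forcing(h, dx, m, v):
    """Return (F, dt) where F[n] = -m*q[n]*v**2 and dt = dx/v.

    q approximates h'' by a centered difference at interior points
    and is copied from the nearest interior point at the two ends.
    """
    h = np.asarray(h, dtype=float)
    q = np.zeros_like(h)
    q[1:-1] = (h[:-2] - 2*h[1:-1] + h[2:])/dx**2   # centered second difference
    q[0] = q[1]                                    # simplest end point treatment
    q[-1] = q[-2]
    return -m*q*v**2, dx/v
```

For a parabolic road $h = x^2$ the second derivative is 2 everywhere, so the returned force is the constant $-2mv^2$, which makes a convenient check.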


1.12.9 Bouncing Ball 


A bouncing ball is a ball in free vertical fall until it impacts the ground, but during
the impact, some kinetic energy is lost, and a new motion upwards with reduced
velocity starts. After the motion is retarded, a new free fall starts, and the process is
repeated. At some point the velocity close to the ground is so small that the ball is
considered to be finally at rest.

The motion of the ball falling in air is governed by Newton's second law $F = ma$,
where $a$ is the acceleration of the body, $m$ is the mass, and $F$ is the sum of all
forces. Here, we neglect the air resistance so that gravity $-mg$ is the only force.
The height of the ball is denoted by $h$ and $v$ is the velocity. The relations between
$h$, $v$, and $a$,

$$h'(t) = v(t), \quad v'(t) = a(t),$$

combined with Newton's second law give the ODE model

$$h''(t) = -g, \tag{1.151}$$

or expressed alternatively as a system of first-order equations:

$$v'(t) = -g, \tag{1.152}$$
$$h'(t) = v(t) . \tag{1.153}$$

These equations govern the motion as long as the ball is away from the ground by a
small distance $\epsilon_h > 0$. When $h < \epsilon_h$, we have two cases.

1. The ball impacts the ground, recognized by a sufficiently large negative velocity
   ($v < -\epsilon_v$). The velocity then changes sign and is reduced by a factor $C_R$, known
   as the coefficient of restitution (footnote 21). For plotting purposes, one may set $h = 0$.

2. The motion stops, recognized by a sufficiently small velocity ($|v| < \epsilon_v$) close to
   the ground.
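The two cases above translate into a small state-update function. The sketch below is one possible reading of the rules, not a reference solution: the name `bounce_step`, the default tolerances, the value of `C_R`, and the use of an Euler-Cromer update for the free fall are all assumptions made for illustration.

```python
def bounce_step(h, v, dt, g=9.81, C_R=0.8, eps_h=1e-3, eps_v=1e-2):
    """Advance height h and velocity v one time step; return (h, v, mode)."""
    if h < eps_h:
        if v < -eps_v:
            # Case 1: impact. Reverse and reduce the velocity, set h = 0.
            return 0.0, -C_R*v, 'impact'
        if abs(v) < eps_v:
            # Case 2: the ball is considered to be at rest.
            return 0.0, 0.0, 'rest'
    # Free fall (also used right after an impact, when v > 0)
    v = v - g*dt
    h = h + v*dt
    return h, v, 'free fall'
```

Calling the function repeatedly from, e.g., `h = 1` and `v = 0` produces a sequence of bounces with decreasing height until the mode `'rest'` is returned.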


1.12.10 Two-Body Gravitational Problem 


Consider two astronomical objects $A$ and $B$ that attract each other by gravitational
forces. $A$ and $B$ could be two stars in a binary system, a planet orbiting a star, or
a moon orbiting a planet. Each object is acted upon by the gravitational force due
to the other object. Consider motion in a plane (for simplicity) and let $(x_A, y_A)$ and
$(x_B, y_B)$ be the positions of object $A$ and $B$, respectively.


20 http://hplgit.github.io/bumpy/doc/pub/bumpy.pdf 
21 http://en.wikipedia.org/wiki/Coefficient_of_restitution

The governing equations Newton's second law of motion applied to each object
is all we need to set up a mathematical model for this physical problem:

$$m_A\ddot{\boldsymbol{x}}_A = \boldsymbol{F}, \tag{1.154}$$
$$m_B\ddot{\boldsymbol{x}}_B = -\boldsymbol{F}, \tag{1.155}$$

where $\boldsymbol{F}$ is the gravitational force

$$\boldsymbol{F} = \frac{Gm_Am_B}{\|\boldsymbol{r}\|^3}\boldsymbol{r},$$

where

$$\boldsymbol{r}(t) = \boldsymbol{x}_B(t) - \boldsymbol{x}_A(t),$$

and $G$ is the gravitational constant: $G = 6.674\cdot 10^{-11}\ \mathrm{Nm^2/kg^2}$.


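Scaling A problem with these equations is that the parameters are very large ($m_A$,
$m_B$, $\|\boldsymbol{r}\|$) or very small ($G$). The rotation time for binary stars can be very small
as well as very large. It is therefore advantageous to scale the equations. A natural
length scale could be the initial distance between the objects: $L = \|\boldsymbol{r}(0)\|$. We write
the dimensionless quantities as

$$\bar{\boldsymbol{x}}_A = \frac{\boldsymbol{x}_A}{L}, \quad \bar{\boldsymbol{x}}_B = \frac{\boldsymbol{x}_B}{L}, \quad \bar t = \frac{t}{t_c} .$$

The gravity force is transformed to

$$\boldsymbol{F} = \frac{Gm_Am_B}{L^2\|\bar{\boldsymbol{r}}\|^3}\bar{\boldsymbol{r}}, \quad \bar{\boldsymbol{r}} = \bar{\boldsymbol{x}}_B - \bar{\boldsymbol{x}}_A,$$

so the first ODE for $\boldsymbol{x}_A$ becomes

$$\frac{d^2\bar{\boldsymbol{x}}_A}{d\bar t^2} = \frac{Gm_Bt_c^2}{L^3}\frac{\bar{\boldsymbol{r}}}{\|\bar{\boldsymbol{r}}\|^3} .$$

Assuming that quantities with a bar and their derivatives are around unity in size, it
is natural to choose $t_c$ such that the fraction $Gm_Bt_c^2/L^3 = 1$:

$$t_c = \sqrt{\frac{L^3}{Gm_B}} .$$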

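From the other equation for $\boldsymbol{x}_B$ we get another candidate for $t_c$ with $m_A$ instead of
$m_B$. Which mass we choose plays a role if $m_A \ll m_B$ or $m_B \ll m_A$. One solution is
to use the sum of the masses:

$$t_c = \sqrt{\frac{L^3}{G(m_A + m_B)}} .$$

Taking a look at Kepler's laws (footnote 22) of planetary motion, the orbital period for a planet
around the star is given by the $t_c$ above, except for a missing factor of $2\pi$, but that
means that $t_c^{-1}$ is just the angular frequency of the motion. Our characteristic time
$t_c$ is therefore highly relevant. Introducing the dimensionless number

$$\alpha = \frac{m_A}{m_B},$$

we can write the dimensionless ODEs as

$$\frac{d^2\bar{\boldsymbol{x}}_A}{d\bar t^2} = \frac{1}{1+\alpha}\frac{\bar{\boldsymbol{r}}}{\|\bar{\boldsymbol{r}}\|^3}, \tag{1.156}$$
$$\frac{d^2\bar{\boldsymbol{x}}_B}{d\bar t^2} = -\frac{1}{1+\alpha^{-1}}\frac{\bar{\boldsymbol{r}}}{\|\bar{\boldsymbol{r}}\|^3} . \tag{1.157}$$

The minus sign in (1.157) is inherited from (1.155). In the limit $m_A \ll m_B$, i.e.,
$\alpha \ll 1$, object $B$ stands still, say $\bar{\boldsymbol{x}}_B = 0$, and object $A$ orbits according to

$$\frac{d^2\bar{\boldsymbol{x}}_A}{d\bar t^2} = -\frac{\bar{\boldsymbol{x}}_A}{\|\bar{\boldsymbol{x}}_A\|^3} .$$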

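Solution in a special case: planet orbiting a star To better see the motion, and
that our scaling is reasonable, we introduce polar coordinates $\bar r$ and $\theta$:

$$\bar{\boldsymbol{x}}_A = \bar r\cos\theta\,\boldsymbol{i} + \bar r\sin\theta\,\boldsymbol{j},$$

which means $\bar{\boldsymbol{x}}_A$ can be written as $\bar{\boldsymbol{x}}_A = \bar r\boldsymbol{i}_r$. Since

$$\frac{d}{d\bar t}\boldsymbol{i}_r = \dot\theta\boldsymbol{i}_\theta, \quad \frac{d}{d\bar t}\boldsymbol{i}_\theta = -\dot\theta\boldsymbol{i}_r,$$

we have

$$\frac{d^2\bar{\boldsymbol{x}}_A}{d\bar t^2} = (\ddot{\bar r} - \bar r\dot\theta^2)\boldsymbol{i}_r + (\bar r\ddot\theta + 2\dot{\bar r}\dot\theta)\boldsymbol{i}_\theta .$$

The equation of motion for mass $A$ is then

$$\ddot{\bar r} - \bar r\dot\theta^2 = -\frac{1}{\bar r^2},$$
$$\bar r\ddot\theta + 2\dot{\bar r}\dot\theta = 0 .$$

The special case of circular motion, $\bar r = 1$, fulfills the equations, since the latter
equation then gives $\dot\theta = \mathrm{const}$ and the former then gives $\dot\theta = 1$, i.e., the motion
is $\bar r(t) = 1$, $\theta(t) = t$, with unit angular frequency as expected and period $2\pi$ as
expected.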
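The circular solution is easy to reproduce numerically. Below is a minimal sketch, assuming the scaled model for object $A$ with object $B$ at rest in the origin and an Euler-Cromer discretization; the function name `simulate_orbit` and the resolution are arbitrary illustrative choices. Starting at $(1,0)$ with velocity $(0,1)$, the computed radius should stay close to 1 and the body should return close to its starting point after the time $2\pi$.

```python
import numpy as np

def simulate_orbit(steps_per_period=2000, num_periods=1):
    """Euler-Cromer integration of x'' = -x/|x|^3 from a circular start."""
    dt = 2*np.pi/steps_per_period
    n = steps_per_period*num_periods
    x, y = 1.0, 0.0          # start on the unit circle
    vx, vy = 0.0, 1.0        # unit angular velocity
    r = np.zeros(n+1)
    r[0] = 1.0
    for k in range(n):
        r3 = (x**2 + y**2)**1.5
        vx -= dt*x/r3        # update velocity from the current position,
        vy -= dt*y/r3
        x += dt*vx           # then position from the new velocity
        y += dt*vy
        r[k+1] = np.hypot(x, y)
    return x, y, r

x, y, r = simulate_orbit()
print('max deviation from r=1:', np.abs(r - 1).max())
print('position after one period:', x, y)
```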


22 https://en.wikipedia.org/wiki/Kepler%27s_laws_of_planetary_motion


1.12.11 Electric Circuits


Although the term "mechanical vibrations" is used in the present book, we must
mention that the same type of equations arises when modeling electric circuits. The
current $I(t)$ in a circuit with an inductor with inductance $L$, a capacitor with
capacitance $C$, and overall resistance $R$, is governed by

$$\ddot I + \frac{R}{L}\dot I + \frac{1}{LC}I = \dot V(t), \tag{1.158}$$

where $V(t)$ is the voltage source powering the circuit. This equation has the same
form as the general model considered in Sect. 1.10 if we set $u = I$, $f(u') = bu'$
and define $b = R/L$, $s(u) = L^{-1}C^{-1}u$, and $F(t) = \dot V(t)$.


1.13 Exercises 


Exercise 1.22: Simulate resonance
We consider the scaled ODE model (1.122) from Sect. 1.12.2. After scaling, the
amplitude of $u$ will have a size about unity as time grows and the effect of the
initial conditions dies out due to damping. However, as $\gamma \to 1$, the amplitude of $u$
increases, especially if $\beta$ is small. This effect is called resonance. The purpose of
this exercise is to explore resonance.

a) Figure out how the solver function in vib.py can be called for the scaled ODE
   (1.122).

b) Run $\gamma = 5, 1.5, 1.1, 1$ for $\beta = 0.005, 0.05, 0.2$. For each $\beta$ value, present an
   image with plots of $u(t)$ for the four $\gamma$ values.

Filename: resonance.


Exercise 1.23: Simulate oscillations of a sliding box
Consider a sliding box on a flat surface as modeled in Sect. 1.12.3. As spring force
we choose the nonlinear formula

$$s(u) = \frac{k}{\alpha}\tanh(\alpha u) = ku - \frac{1}{3}\alpha^2ku^3 + \frac{2}{15}\alpha^4ku^5 + \mathcal{O}(u^7) .$$

a) Plot $g(u) = \alpha^{-1}\tanh(\alpha u)$ for various values of $\alpha$. Assume $u \in [-1, 1]$.

b) Scale the equations using $I$ as scale for $u$ and $\sqrt{m/k}$ as time scale.

c) Implement the scaled model in b). Run it for some values of the dimensionless
   parameters.

Filename: sliding_box.


Exercise 1.24: Simulate a bouncing ball

Section 1.12.9 presents a model for a bouncing ball. Choose one of the two ODE
formulations, (1.151) or (1.152)-(1.153), and simulate the motion of a bouncing ball.
Plot $h(t)$. Think about how to plot $v(t)$.

Hint A naive implementation may get stuck in repeated impacts for large time step
sizes. To avoid this situation, one can introduce a state variable that holds the mode
of the motion: free fall, impact, or rest. Two consecutive impacts imply that the
motion has stopped.

Filename: bouncing_ball.


Exercise 1.25: Simulate a simple pendulum

Simulation of a simple pendulum can be carried out by using the mathematical model
derived in Sect. 1.12.5 and calling up functionality in the vib.py file (i.e., solve the
second-order ODE by centered finite differences).

a) Scale the model. Set up the dimensionless governing equation for $\theta$ and
   expressions for dimensionless drag and wire forces.

b) Write a function for computing $\theta$ and the dimensionless drag force and the force
   in the wire, using the solver function in the vib.py file. Plot these three
   quantities below each other (in subplots) so the graphs can be compared. Run
   two cases, first one in the limit of $\Theta$ small and no drag, and then a second one
   with $\Theta = 40$ degrees and $\alpha = 0.8$.

Filename: simple_pendulum.

Exercise 1.26: Simulate an elastic pendulum

Section 1.12.7 describes a model for an elastic pendulum, resulting in a system of
two ODEs. The purpose of this exercise is to implement the scaled model, test the
software, and generalize the model.

a) Write a function simulate that can simulate an elastic pendulum using the
   scaled model. The function should have the following arguments:

    def simulate(
        beta=0.9,                  # dimensionless parameter
        Theta=30,                  # initial angle in degrees
        epsilon=0,                 # initial stretch of wire
        num_periods=6,             # simulate for num_periods
        time_steps_per_period=60,  # time step resolution
        plot=True,                 # make plots or not
        ):

   To set the total simulation time and the time step, we use our knowledge of the
   scaled, classical, non-elastic pendulum: $u'' + u = 0$, with solution $u = \Theta\cos\bar t$.
   The period of these oscillations is $P = 2\pi$ and the frequency is unity. The time
   for simulation is taken as num_periods times $P$. The time step is set as $P$
   divided by time_steps_per_period.

   The simulate function should return the arrays of $\bar x$, $\bar y$, $\theta$, and $\bar t$, where
   $\theta = \tan^{-1}(\bar x/(1 - \bar y))$ is the angular displacement of the elastic pendulum
   corresponding to the position $(\bar x, \bar y)$.

   If plot is True, make a plot of $\bar y(\bar t)$ versus $\bar x(\bar t)$, i.e., the physical motion of
   the mass at $(\bar x, \bar y)$. Use the equal aspect ratio on the axis such that we get a
   physically correct picture of the motion. Also make a plot of $\theta(\bar t)$, where $\theta$ is
   measured in degrees. If $\Theta < 10$ degrees, add a plot that compares the solutions
   of the scaled, classical, non-elastic pendulum and the elastic pendulum ($\theta(\bar t)$).
   Although the mathematics here employs a bar over scaled quantities, the code
   should feature plain names x for $\bar x$, y for $\bar y$, and t for $\bar t$ (rather than x_bar,
   etc.). These variable names make the code easier to read and compare with the
   mathematics.


Hint 1 Equal aspect ratio is set by plt.gca().set_aspect('equal') in Matplotlib
(import matplotlib.pyplot as plt) and in SciTools by the command
plt.plot(..., daspect=[1,1,1], daspectmode='equal') (provided you
have done import scitools.std as plt).

Hint 2 If you want to use Odespy to solve the equations, order the ODEs like
$\dot{\bar x}, \bar x, \dot{\bar y}, \bar y$ such that odespy.EulerCromer can be applied.


b) Write a test function for testing that $\Theta = 0$ and $\epsilon = 0$ gives $\bar x = \bar y = 0$ for all
   times.

c) Write another test function for checking that the pure vertical motion of the
   elastic pendulum is correct. Start with simplifying the ODEs for pure vertical
   motion and show that $\bar y(\bar t)$ fulfills a vibration equation with frequency
   $\sqrt{\beta/(1 - \beta)}$. Set up the exact solution.
   Write a test function that uses this special case to verify the simulate function.
   There will be numerical approximation errors present in the results from
   simulate so you have to believe in correct results and set a (low) tolerance that
   corresponds to the computed maximum error. Use a small $\Delta t$ to obtain a small
   numerical approximation error.

d) Make a function demo(beta, Theta) for simulating an elastic pendulum with
   a given $\beta$ parameter and initial angle $\Theta$. Use 600 time steps per period to get
   very accurate results, and simulate for 3 periods.


Filename: elastic_pendulum. 


Exercise 1.27: Simulate an elastic pendulum with air resistance 

This is a continuation of Exercise 1.26. Air resistance on the body with mass $m$ can
be modeled by the force $-\frac{1}{2}\varrho C_DA\|\boldsymbol{v}\|\boldsymbol{v}$, where $C_D$ is a drag coefficient (0.2 for a
sphere), $\varrho$ is the density of air (1.2 kg m$^{-3}$), $A$ is the cross section area ($A = \pi R^2$
for a sphere, where $R$ is the radius), and $\boldsymbol{v}$ is the velocity of the body. Include air
resistance in the original model, scale the model, write a function simulate_drag
that is a copy of the simulate function from Exercise 1.26, but with the new ODEs
included, and show plots of how air resistance influences the motion.

Filename: elastic_pendulum_drag. 


Remarks Test functions are challenging to construct for the problem with air
resistance. You can reuse the tests from Exercise 1.26 for simulate_drag, but these
tests do not verify the new terms arising from air resistance.


Exercise 1.28: Implement the PEFRL algorithm

We consider the motion of a planet around a star (Sect. 1.12.10). The simplified
case where one mass is very much bigger than the other and one object is at rest,
results in the scaled ODE model

$$\ddot x + (x^2 + y^2)^{-3/2}x = 0,$$
$$\ddot y + (x^2 + y^2)^{-3/2}y = 0 .$$

a) It is easy to show that $x(t)$ and $y(t)$ go like sine and cosine functions. Use this
   idea to derive the exact solution.

b) One believes that a planet may orbit a star for billions of years. We are now
   interested in how accurate methods we actually need for such calculations. A
   first task is to determine what the time interval of interest is in scaled units. Take
   the earth and sun as typical objects and find the characteristic time used in the
   scaling of the equations ($t_c = \sqrt{L^3/(mG)}$), where $m$ is the mass of the sun, $L$
   is the distance between the sun and the earth, and $G$ is the gravitational constant.
   Find the scaled time interval corresponding to one billion years.

c) Solve the equations using the 4th-order Runge-Kutta and the Euler-Cromer
   methods. You may benefit from applying Odespy for this purpose. With each solver,
   simulate 10,000 orbits and print the maximum position error and CPU time as
   a function of time step. Note that the maximum position error does not
   necessarily occur at the end of the simulation. The position error achieved with each
   solver will depend heavily on the size of the time step. Let the time step
   correspond to 200, 400, 800 and 1600 steps per orbit, respectively. Are the results as
   expected? Explain briefly. When you develop your program, have in mind that
   it will be extended with an implementation of the other algorithms (as requested
   in d) and e) later) and experiments with this algorithm as well.

d) Implement a solver based on the PEFRL method from Sect. 1.10.11. Verify its
   4th-order convergence using an equation $u'' + u = 0$.

e) The simulations done previously with the 4th-order Runge-Kutta and
   Euler-Cromer are now to be repeated with the PEFRL solver, so the code must be
   extended accordingly. Then run the simulations and comment on the
   performance of PEFRL compared to the other two.

f) Use the PEFRL solver to simulate 100,000 orbits with a fixed time step
   corresponding to 1600 steps per period. Record the maximum error within each
   subsequent group of 1000 orbits. Plot these errors and fit (least squares) a
   mathematical function to the data. Print also the total CPU time spent for all 100,000
   orbits.
   Now, predict the error and required CPU time for a simulation of 1 billion years
   (orbits). Is it feasible on today's computers to simulate the planetary motion for
   one billion years?

Filename: vib_PEFRL.


Remarks This exercise investigates whether it is feasible to predict planetary
motion for the life time of a solar system.


Wave Equations


A very wide range of physical processes lead to wave motion, where signals are
propagated through a medium in space and time, normally with little or no
permanent movement of the medium itself. The shape of the signals may undergo
changes as they travel through matter, but usually not so much that the signals
cannot be recognized at some later point in space and time. Many types of wave motion
can be described by the equation $u_{tt} = \nabla\cdot(c^2\nabla u) + f$, which we will solve in the
forthcoming text by finite difference methods.


2.1 Simulation of Waves on a String 


We begin our study of wave equations by simulating one-dimensional waves on a
string, say on a guitar or violin. Let the string in the undeformed state coincide with
the interval $[0, L]$ on the $x$ axis, and let $u(x,t)$ be the displacement at time $t$ in the
$y$ direction of a point initially at $x$. The displacement function $u$ is governed by the
mathematical model

$$\frac{\partial^2u}{\partial t^2} = c^2\frac{\partial^2u}{\partial x^2}, \quad x \in (0,L),\ t \in (0,T] \tag{2.1}$$
$$u(x,0) = I(x), \quad x \in [0,L] \tag{2.2}$$
$$\frac{\partial}{\partial t}u(x,0) = 0, \quad x \in [0,L] \tag{2.3}$$
$$u(0,t) = 0, \quad t \in (0,T] \tag{2.4}$$
$$u(L,t) = 0, \quad t \in (0,T] . \tag{2.5}$$

The constant $c$ and the function $I(x)$ must be prescribed.

Equation (2.1) is known as the one-dimensional wave equation. Since this PDE
contains a second-order derivative in time, we need two initial conditions. The
condition (2.2) specifies the initial shape of the string, $I(x)$, and (2.3) expresses
that the initial velocity of the string is zero. In addition, PDEs need boundary
conditions, given here as (2.4) and (2.5). These two conditions specify that the
string is fixed at the ends, i.e., that the displacement $u$ is zero.

The solution $u(x,t)$ varies in space and time and describes waves that move with
velocity $c$ to the left and right.

Sometimes we will use a more compact notation for the partial derivatives to
save space:

$$u_t = \frac{\partial u}{\partial t}, \quad u_{tt} = \frac{\partial^2u}{\partial t^2}, \tag{2.6}$$

and similar expressions for derivatives with respect to other variables. Then the
wave equation can be written compactly as $u_{tt} = c^2u_{xx}$.

The PDE problem (2.1)-(2.5) will now be discretized in space and time by a
finite difference method.


2.1.1 Discretizing the Domain 
The temporal domain $[0, T]$ is represented by a finite number of mesh points

$$0 = t_0 < t_1 < t_2 < \cdots < t_{N_t-1} < t_{N_t} = T . \tag{2.7}$$

Similarly, the spatial domain $[0, L]$ is replaced by a set of mesh points

$$0 = x_0 < x_1 < x_2 < \cdots < x_{N_x-1} < x_{N_x} = L . \tag{2.8}$$

One may view the mesh as two-dimensional in the $x,t$ plane, consisting of points
$(x_i, t_n)$, with $i = 0,\ldots,N_x$ and $n = 0,\ldots,N_t$.


Uniform meshes For uniformly distributed mesh points we can introduce the
constant mesh spacings $\Delta t$ and $\Delta x$. We have that

$$x_i = i\Delta x,\ i = 0,\ldots,N_x, \quad t_n = n\Delta t,\ n = 0,\ldots,N_t . \tag{2.9}$$

We also have that $\Delta x = x_i - x_{i-1}$, $i = 1,\ldots,N_x$, and $\Delta t = t_n - t_{n-1}$,
$n = 1,\ldots,N_t$. Figure 2.1 displays a mesh in the $x,t$ plane with $N_t = 5$, $N_x = 5$, and
constant mesh spacings.
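In code, a uniform mesh of the type (2.9) is conveniently created with NumPy. The snippet below is a small sketch; the domain sizes `L` and `T` are arbitrary illustrative values, while `Nx = Nt = 5` matches the mesh described for Figure 2.1.

```python
import numpy as np

L, T = 1.0, 2.0                  # illustrative domain sizes (assumed values)
Nx, Nt = 5, 5                    # number of mesh intervals in space and time

x = np.linspace(0, L, Nx+1)      # mesh points x_0, ..., x_Nx
t = np.linspace(0, T, Nt+1)      # mesh points t_0, ..., t_Nt
dx = x[1] - x[0]                 # constant spacing in space
dt = t[1] - t[0]                 # constant spacing in time
```

Note that `Nx` and `Nt` count intervals, so the arrays hold `Nx+1` and `Nt+1` points, consistent with the index ranges $i = 0,\ldots,N_x$ and $n = 0,\ldots,N_t$.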


2.1.2 The Discrete Solution 


The solution $u(x,t)$ is sought at the mesh points. We introduce the mesh
function $u_i^n$, which approximates the exact solution at the mesh point $(x_i, t_n)$ for
$i = 0,\ldots,N_x$ and $n = 0,\ldots,N_t$. Using the finite difference method, we shall develop
algebraic equations for computing the mesh function.


2.1.3 Fulfilling the Equation at the Mesh Points 


In the finite difference method, we relax the condition that (2.1) holds at all points
in the space-time domain $(0, L) \times (0, T]$ to the requirement that the PDE is fulfilled
at the interior mesh points only:

$$\frac{\partial^2}{\partial t^2}u(x_i,t_n) = c^2\frac{\partial^2}{\partial x^2}u(x_i,t_n), \tag{2.10}$$

for $i = 1,\ldots,N_x-1$ and $n = 1,\ldots,N_t-1$. For $n = 0$ we have the initial
conditions $u = I(x)$ and $u_t = 0$, and at the boundaries $i = 0, N_x$ we have the
boundary condition $u = 0$.


2.1.4 Replacing Derivatives by Finite Differences 


The second-order derivatives can be replaced by central differences. The most
widely used difference approximation of the second-order derivative is

$$\frac{\partial^2}{\partial t^2}u(x_i,t_n) \approx \frac{u_i^{n+1} - 2u_i^n + u_i^{n-1}}{\Delta t^2} .$$

It is convenient to introduce the finite difference operator notation

$$[D_tD_tu]_i^n = \frac{u_i^{n+1} - 2u_i^n + u_i^{n-1}}{\Delta t^2} .$$

A similar approximation of the second-order derivative in the $x$ direction reads

$$\frac{\partial^2}{\partial x^2}u(x_i,t_n) \approx \frac{u_{i+1}^n - 2u_i^n + u_{i-1}^n}{\Delta x^2} = [D_xD_xu]_i^n .$$


Algebraic version of the PDE We can now replace the derivatives in (2.10) and
get

$$\frac{u_i^{n+1} - 2u_i^n + u_i^{n-1}}{\Delta t^2} = c^2\frac{u_{i+1}^n - 2u_i^n + u_{i-1}^n}{\Delta x^2}, \tag{2.11}$$

or written more compactly using the operator notation:

$$[D_tD_tu = c^2D_xD_xu]_i^n . \tag{2.12}$$


Interpretation of the equation as a stencil A characteristic feature of (2.11) is
that it involves $u$ values from neighboring points only: $u_i^{n+1}$, $u_{i\pm 1}^n$, $u_i^n$, and $u_i^{n-1}$.
The circles in Fig. 2.1 illustrate such neighboring mesh points that contribute to an
algebraic equation. In this particular case, we have sampled the PDE at the point
$(2,2)$ and constructed (2.11), which then involves a coupling of $u_2^1$, $u_1^2$, $u_2^2$, $u_3^2$, and
$u_2^3$. The term stencil is often used about the algebraic equation at a mesh point, and
the geometry of a typical stencil is illustrated in Fig. 2.1. One also often refers to
the algebraic equations as discrete equations, (finite) difference equations or a finite
difference scheme.


Algebraic version of the initial conditions We also need to replace the
derivative in the initial condition (2.3) by a finite difference approximation. A centered
difference of the type

$$\frac{\partial}{\partial t}u(x_i,t_0) \approx \frac{u_i^1 - u_i^{-1}}{2\Delta t} = [D_{2t}u]_i^0$$

seems appropriate. Writing out this equation and ordering the terms give

$$u_i^{-1} = u_i^1, \quad i = 0,\ldots,N_x . \tag{2.13}$$

The other initial condition can be computed by

$$u_i^0 = I(x_i), \quad i = 0,\ldots,N_x .$$

Fig. 2.1 Mesh in space and time (axes: index i, index n). The circles show points connected in a finite difference equation

2.1.5 Formulating a Recursive Algorithm 


We assume that $u_i^n$ and $u_i^{n-1}$ are available for $i = 0,\ldots,N_x$. The only unknown
quantity in (2.11) is therefore $u_i^{n+1}$, which we now can solve for:

$$u_i^{n+1} = -u_i^{n-1} + 2u_i^n + C^2\left(u_{i+1}^n - 2u_i^n + u_{i-1}^n\right) . \tag{2.14}$$

We have here introduced the parameter

$$C = c\frac{\Delta t}{\Delta x}, \tag{2.15}$$

known as the Courant number.


C is the key parameter in the discrete wave equation

We see that the discrete version of the PDE features only one parameter, $C$,
which is therefore the key parameter, together with $N_x$, that governs the quality
of the numerical solution (see Sect. 2.10 for details). Both the primary physical
parameter $c$ and the numerical parameters $\Delta x$ and $\Delta t$ are lumped together in $C$.
Note that $C$ is a dimensionless parameter.

Fig. 2.2 Modified stencil for the first time step (axes: index i, index n)


Given that $u_i^{n-1}$ and $u_i^n$ are known for $i = 0,\ldots,N_x$, we find new values at the
next time level by applying the formula (2.14) for $i = 1,\ldots,N_x - 1$. Figure 2.1
illustrates the points that are used to compute $u_2^3$. For the boundary points, $i = 0$
and $i = N_x$, we apply the boundary conditions $u_i^{n+1} = 0$.

Even though sound reasoning leads up to (2.14), there is still a minor challenge
with it that needs to be resolved. Think of the very first computational step to
be made. The scheme (2.14) is supposed to start at $n = 1$, which means that
we compute $u^2$ from $u^1$ and $u^0$. Unfortunately, we do not know the value of $u^1$,
so how to proceed? A standard procedure in such cases is to apply (2.14) also
for $n = 0$. This immediately seems strange, since it involves $u_i^{-1}$, which is an
undefined quantity outside the time mesh (and the time domain). However, we can
use the initial condition (2.13) in combination with (2.14) when $n = 0$ to eliminate
$u_i^{-1}$ and arrive at a special formula for $u_i^1$. Inserting $u_i^{-1} = u_i^1$ in (2.14) for $n = 0$
gives $2u_i^1 = 2u_i^0 + C^2(u_{i+1}^0 - 2u_i^0 + u_{i-1}^0)$, i.e.,

$$u_i^1 = u_i^0 + \frac{1}{2}C^2\left(u_{i+1}^0 - 2u_i^0 + u_{i-1}^0\right) . \tag{2.16}$$

Figure 2.2 illustrates how (2.16) connects four instead of five points: $u_2^1$, $u_1^0$, $u_2^0$, and
$u_3^0$.

We can now summarize the computational algorithm:

1. Compute $u_i^0 = I(x_i)$ for $i = 0,\ldots,N_x$
2. Compute $u_i^1$ by (2.16) for $i = 1,2,\ldots,N_x - 1$ and set $u_i^1 = 0$ for the boundary
   points given by $i = 0$ and $i = N_x$,
3. For each time level $n = 1,2,\ldots,N_t - 1$
   (a) apply (2.14) to find $u_i^{n+1}$ for $i = 1,\ldots,N_x - 1$
   (b) set $u_i^{n+1} = 0$ for the boundary points having $i = 0$, $i = N_x$.

The algorithm essentially consists of moving a finite difference stencil through all
the mesh points, which can be seen as an animation in a web page (footnote 1) or a movie file (footnote 2).


2.1.6 Sketch of an Implementation 


The algorithm only involves the three most recent time levels, so we need only
three arrays for $u_i^{n+1}$, $u_i^n$, and $u_i^{n-1}$, $i = 0,\ldots,N_x$. Storing all the solutions in
a two-dimensional array of size $(N_x + 1) \times (N_t + 1)$ would be possible in this
simple one-dimensional PDE problem, but is normally out of the question in
three-dimensional (3D) and large two-dimensional (2D) problems. We shall therefore, in
all our PDE solving programs, have the unknown in memory at as few time levels
as possible.

In a Python implementation of this algorithm, we use the array elements u[i] to
store $u_i^{n+1}$, u_n[i] to store $u_i^n$, and u_nm1[i] to store $u_i^{n-1}$.

The following Python snippet realizes the steps in the computational algorithm.

    # Given mesh points as arrays x and t (x[i], t[n]), the wave
    # velocity c, the function I(x), and preallocated NumPy arrays
    # u, u_n, u_nm1 of length Nx+1
    dx = x[1] - x[0]
    dt = t[1] - t[0]
    C = c*dt/dx            # Courant number
    Nx = len(x)-1
    Nt = len(t)-1
    C2 = C**2              # Help variable in the scheme

    # Set initial condition u(x,0) = I(x)
    for i in range(0, Nx+1):
        u_n[i] = I(x[i])

    # Apply special formula (2.16) for first step, incorporating du/dt=0
    for i in range(1, Nx):
        u[i] = u_n[i] + \
               0.5*C2*(u_n[i+1] - 2*u_n[i] + u_n[i-1])
    u[0] = 0;  u[Nx] = 0   # Enforce boundary conditions

    # Switch variables before next step
    u_nm1[:], u_n[:] = u_n, u

    for n in range(1, Nt):
        # Update all inner mesh points at time t[n+1] by (2.14)
        for i in range(1, Nx):
            u[i] = 2*u_n[i] - u_nm1[i] + \
                   C2*(u_n[i+1] - 2*u_n[i] + u_n[i-1])

        # Insert boundary conditions
        u[0] = 0;  u[Nx] = 0

        # Switch variables before next step
        u_nm1[:], u_n[:] = u_n, u
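To make the algorithm testable as a whole, the steps can be wrapped in a function. The following is a self-contained sketch and not the book's program: the name `solver` and its argument list are assumptions, NumPy slices replace the explicit loops over `i`, and the three arrays are switched by swapping references instead of copying. The scheme itself is (2.14), with the first step derived from (2.14) for $n = 0$ and $u_i^{-1} = u_i^1$, which gives $u_i^1 = u_i^0 + \frac{1}{2}C^2(u_{i+1}^0 - 2u_i^0 + u_{i-1}^0)$.

```python
import numpy as np

def solver(I, c, L, Nx, C, T):
    """Solve u_tt = c^2*u_xx on (0,L) with u=0 at both ends and u_t(x,0)=0.

    I must accept a NumPy array. Returns the solution at the final
    time level together with the mesh arrays x and t.
    """
    x = np.linspace(0, L, Nx+1)
    dx = x[1] - x[0]
    dt = C*dx/c                        # time step implied by the Courant number
    Nt = int(round(T/dt))
    t = np.linspace(0, Nt*dt, Nt+1)
    C2 = C**2

    u_n = np.asarray(I(x), dtype=float).copy()   # u at time level n
    u = np.zeros(Nx+1)                           # u at time level n+1
    u_nm1 = np.zeros(Nx+1)                       # u at time level n-1

    # Special formula for the first step (u_t = 0 at t = 0)
    u[1:-1] = u_n[1:-1] + 0.5*C2*(u_n[2:] - 2*u_n[1:-1] + u_n[:-2])
    u[0] = u[-1] = 0
    u_nm1, u_n, u = u_n, u, u_nm1      # switch references, no copying

    for n in range(1, Nt):
        u[1:-1] = 2*u_n[1:-1] - u_nm1[1:-1] + \
                  C2*(u_n[2:] - 2*u_n[1:-1] + u_n[:-2])
        u[0] = u[-1] = 0
        u_nm1, u_n, u = u_n, u, u_nm1
    return u_n, x, t
```

A call such as `u, x, t = solver(lambda x: np.sin(np.pi*x), c=1.0, L=1.0, Nx=20, C=1.0, T=2.0)` simulates one period of a standing wave. For $C = 1$ the scheme reproduces this particular standing wave to rounding error, which is a useful sanity check.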


1 http://tinyurl.com/hbcasmj/book/html/mov-wave/D_stencil_gpl/index.html
2 http://tinyurl.com/gokgkov/mov-wave/D_stencil_gpl/movie.ogg


2.2 Verification


Before implementing the algorithm, it is convenient to add a source term to the PDE
(2.1), since that gives us more freedom in finding test problems for verification.
Physically, a source term acts as a generator for waves in the interior of the domain.


2.2.1 A Slightly Generalized Model Problem 


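We now address the following extended initial-boundary value problem for
one-dimensional wave phenomena:

$$u_{tt} = c^2u_{xx} + f(x,t), \quad x \in (0,L),\ t \in (0,T] \tag{2.17}$$
$$u(x,0) = I(x), \quad x \in [0,L] \tag{2.18}$$
$$u_t(x,0) = V(x), \quad x \in [0,L] \tag{2.19}$$
$$u(0,t) = 0, \quad t > 0 \tag{2.20}$$
$$u(L,t) = 0, \quad t > 0 . \tag{2.21}$$

Sampling the PDE at $(x_i, t_n)$ and using the same finite difference approximations
as above, yields

$$[D_tD_tu = c^2D_xD_xu + f]_i^n . \tag{2.22}$$

Writing this out and solving for the unknown $u_i^{n+1}$ results in

$$u_i^{n+1} = -u_i^{n-1} + 2u_i^n + C^2\left(u_{i+1}^n - 2u_i^n + u_{i-1}^n\right) + \Delta t^2f_i^n . \tag{2.23}$$

The equation for the first time step must be rederived. The discretization of the
initial condition $u_t = V(x)$ at $t = 0$ becomes

$$[D_{2t}u = V]_i^0 \quad\Rightarrow\quad u_i^{-1} = u_i^1 - 2\Delta tV_i,$$

which, when inserted in (2.23) for $n = 0$, gives $2u_i^1 = 2\Delta tV_i + 2u_i^0 + C^2(u_{i+1}^0 - 2u_i^0 + u_{i-1}^0) + \Delta t^2f_i^0$
and hence the special formula

$$u_i^1 = u_i^0 + \Delta tV_i + \frac{1}{2}C^2\left(u_{i+1}^0 - 2u_i^0 + u_{i-1}^0\right) + \frac{1}{2}\Delta t^2f_i^0 . \tag{2.24}$$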
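The first step of the generalized scheme can be isolated in a small function. The sketch below follows from inserting $u_i^{-1} = u_i^1 - 2\Delta tV_i$ in (2.23) for $n = 0$ and solving for $u_i^1$, which gives the term $+\Delta tV_i$; the function name `first_step` and its signature are hypothetical and chosen only for illustration.

```python
import numpy as np

def first_step(u_n, x, dt, C, V=None, f=None):
    """Return u^1 computed from u^0 = u_n.

    V(x) is the initial velocity and f(x, t) the source term; both are
    optional callables accepting NumPy arrays. Boundary values are zero.
    """
    u = np.zeros_like(u_n)
    inner = slice(1, -1)
    Vi = V(x[inner]) if V is not None else 0.0
    fi = f(x[inner], 0.0) if f is not None else 0.0
    u[inner] = (u_n[inner] + dt*Vi +
                0.5*C**2*(u_n[2:] - 2*u_n[inner] + u_n[:-2]) +
                0.5*dt**2*fi)
    return u
```

One way to check the signs is the quadratic solution $u = x(L-x)(1 + t/2)$ with $f = 2c^2(1 + t/2)$, $I = x(L-x)$ and $V = \frac{1}{2}x(L-x)$: the second difference of a quadratic is exact, so the function should return $x(L-x)(1 + \Delta t/2)$ up to rounding errors.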


2.2.2 Using an Analytical Solution of Physical Significance 


Many wave problems feature sinusoidal oscillations in time and space. For example,
the original PDE problem (2.1)-(2.5) allows an exact solution

$$u_e(x,t) = A\sin\left(\frac{\pi}{L}x\right)\cos\left(\frac{\pi}{L}ct\right) . \tag{2.25}$$

This $u_e$ fulfills the PDE with $f = 0$, boundary conditions $u_e(0,t) = u_e(L,t) = 0$,
as well as initial conditions $I(x) = A\sin\left(\frac{\pi}{L}x\right)$ and $V = 0$.

How to use exact solutions for verification

It is common to use such exact solutions of physical interest to verify
implementations. However, the numerical solution $u_i^n$ will only be an approximation
to $u_e(x_i,t_n)$. We have no knowledge of the precise size of the error in this
approximation, and therefore we can never know if discrepancies between $u_i^n$ and
$u_e(x_i,t_n)$ are caused by mathematical approximations or programming errors.
In particular, if plots of the computed solution $u_i^n$ and the exact one (2.25) look
similar, many are tempted to claim that the implementation works. However,
even if color plots look nice and the accuracy is "deemed good", there can still
be serious programming errors present!

The only way to use exact physical solutions like (2.25) for serious and
thorough verification is to run a series of simulations on finer and finer meshes,
measure the integrated error in each mesh, and from this information estimate
the empirical convergence rate of the method.

An introduction to the computing of convergence rates is given in Section
3.1.6 in [9]. There is also a detailed example on computing convergence rates in
Sect. 1.2.2.

In the present problem, one expects the method to have a convergence rate of 2
(see Sect. 2.10), so if the computed rates are close to 2 on a sufficiently fine mesh,
we have good evidence that the implementation is free of programming mistakes.


2.2.3 Manufactured Solution and Estimation of Convergence Rates 


Specifying the solution and computing corresponding data One problem with 
the exact solution (2.25) is that it requires a simplification (V = 0, f = 0) of 
the implemented problem (2.17)-(2.21). An advantage of using a manufactured 
solution is that we can test all terms in the PDE problem. The idea of this approach 
is to set up some chosen solution and fit the source term, boundary conditions, 
and initial conditions to be compatible with the chosen solution. Given that our 
boundary conditions in the implementation are u(0,t) = u(L,t) = 0, we must 
choose a solution that fulfills these conditions. One example is 


\[ u_e(x,t) = x(L-x)\sin t . \]

Inserted in the PDE $u_{tt} = c^2 u_{xx} + f$ we get

\[ -x(L-x)\sin t = -2c^2 \sin t + f \quad\Rightarrow\quad f = (2c^2 - x(L-x))\sin t . \]

The initial conditions become

\[ u(x,0) = I(x) = 0, \]
\[ u_t(x,0) = V(x) = x(L-x) . \]
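A fitted source term is easy to get wrong by a sign or a factor, so a quick numerical check can be worthwhile. The sketch below inserts the manufactured solution in the PDE, with derivatives replaced by central differences on a small step h. The values of L, c and h and the name pde_residual are arbitrary assumptions made for this illustration.

```python
import numpy as np

L, c = 2.0, 1.5   # arbitrary illustrative values

def u_e(x, t):
    """Manufactured solution x(L-x) sin t."""
    return x*(L - x)*np.sin(t)

def f(x, t):
    """Source term fitted to u_e."""
    return (2*c**2 - x*(L - x))*np.sin(t)

def pde_residual(x, t, h=1e-3):
    """Approximate u_tt - c^2 u_xx - f by central differences of u_e."""
    u_tt = (u_e(x, t + h) - 2*u_e(x, t) + u_e(x, t - h))/h**2
    u_xx = (u_e(x + h, t) - 2*u_e(x, t) + u_e(x - h, t))/h**2
    return u_tt - c**2*u_xx - f(x, t)

x = np.linspace(0, L, 11)
max_residual = np.abs(pde_residual(x, t=0.7)).max()
```

The residual is not exactly zero because of the difference approximation in time, but it should be of size h**2 or smaller.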
Defining a single discretization parameter To verify the code, we compute the
convergence rates in a series of simulations, letting each simulation use a finer mesh
than the previous one. Such empirical estimation of convergence rates relies on an
assumption that some measure $E$ of the numerical error is related to the
discretization parameters through

\[ E = C_t \Delta t^r + C_x \Delta x^p , \]

where $C_t$, $C_x$, $r$, and $p$ are constants. The constants $r$ and $p$ are known as the
convergence rates in time and space, respectively. From the accuracy in the finite
difference approximations, we expect $r = p = 2$, since the error terms are of order
$\Delta t^2$ and $\Delta x^2$. This is confirmed by truncation error analysis and other types of
analysis.

By using an exact solution of the PDE problem, we will next compute the error
measure $E$ on a sequence of refined meshes and see if the rates $r = p = 2$ are
obtained. We will not be concerned with estimating the constants $C_t$ and $C_x$, simply
because we are not interested in their values.

It is advantageous to introduce a single discretization parameter $h = \Delta t = \hat c \Delta x$
for some constant $\hat c$. Since $\Delta t$ and $\Delta x$ are related through the Courant number,
$\Delta t = C\Delta x/c$, we set $h = \Delta t$, and then $\Delta x = hc/C$. Now the expression for the
error measure is greatly simplified (taking $p = r$):

\[ E = C_t \Delta t^r + C_x \Delta x^r = C_t h^r + C_x \left(\frac{c}{C}\right)^r h^r = D h^r, \quad D = C_t + C_x \left(\frac{c}{C}\right)^r . \]


Computing errors We choose an initial discretization parameter $h_0$ and run
experiments with decreasing $h$: $h_i = 2^{-i} h_0$, $i = 1, 2, \ldots, m$. Halving $h$ in each
experiment is not necessary, but it is a common choice. For each experiment we
must record $E$ and $h$. Standard choices of error measure are the $\ell^2$ and $\ell^\infty$ norms
of the error mesh function $e_i^n$:

\[ E = \|e_i^n\|_{\ell^2} = \left( \Delta t \Delta x \sum_{n=0}^{N_t} \sum_{i=0}^{N_x} (e_i^n)^2 \right)^{\frac{1}{2}}, \quad e_i^n = u_e(x_i, t_n) - u_i^n , \tag{2.26} \]
\[ E = \|e_i^n\|_{\ell^\infty} = \max_{i,n} |e_i^n| . \tag{2.27} \]


In Python, one can compute $\sum_i (e_i^n)^2$ at each time step and accumulate the
value in some sum variable, say e2_sum. At the final time step one can do
sqrt(dt*dx*e2_sum). For the $\ell^\infty$ norm one must compare the maximum error
at a time level (e.max()) with the global maximum over the time domain:
e_max = max(e_max, e.max()).
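The accumulation just described can be sketched as a small callable object that is invoked once per time level. The class name ErrorAccumulator and the synthetic "numerical" solution (the exact one plus a constant offset of 0.1) are assumptions made for this illustration only.

```python
import numpy as np

class ErrorAccumulator:
    """Accumulate the sums behind (2.26) and (2.27) over all time levels.

    Hypothetical helper taking the arguments (u, x, t, n) at each level.
    """
    def __init__(self, u_exact):
        self.u_exact = u_exact
        self.e2_sum = 0.0
        self.e_max = 0.0

    def __call__(self, u, x, t, n):
        e = self.u_exact(x, t[n]) - u
        self.e2_sum += np.sum(e**2)
        self.e_max = max(self.e_max, np.abs(e).max())

    def norms(self, dx, dt):
        """Return the l2 norm (2.26) and the max norm (2.27)."""
        return np.sqrt(dt*dx*self.e2_sum), self.e_max

# Synthetic demonstration on a tiny mesh: 5 points in space, 3 in time
x = np.linspace(0, 1, 5)
t = np.linspace(0, 1, 3)
u_exact = lambda x, t: np.sin(np.pi*x)*np.cos(np.pi*t)
acc = ErrorAccumulator(u_exact)
for n in range(t.size):
    acc(u_exact(x, t[n]) + 0.1, x, t, n)   # offset mimics a numerical error
E_l2, E_max = acc.norms(dx=x[1] - x[0], dt=t[1] - t[0])
```

With a constant error of 0.1 in all 15 mesh points, the max norm is 0.1 and the l2 norm is sqrt(dt*dx*15*0.01).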

An alternative error measure is to use a spatial norm at one time step only, e.g.,
the end time $T$ ($n = N_t$):

\[ E = \|e_i^n\|_{\ell^2} = \left( \Delta x \sum_{i=0}^{N_x} (e_i^n)^2 \right)^{\frac{1}{2}}, \quad e_i^n = u_e(x_i, t_n) - u_i^n , \tag{2.28} \]
\[ E = \|e_i^n\|_{\ell^\infty} = \max_{0 \le i \le N_x} |e_i^n| . \tag{2.29} \]

The important point is that the error measure ($E$) for the simulation is represented
by a single number.
Computing rates Let $E_i$ be the error measure in experiment (mesh) number $i$
(not to be confused with the spatial index $i$) and let $h_i$ be the corresponding
discretization parameter ($h$). With the error model $E_i = D h_i^r$, we can estimate $r$ by
comparing two consecutive experiments:

\[ E_{i+1} = D h_{i+1}^r , \]
\[ E_i = D h_i^r . \]

Dividing the two equations eliminates the (uninteresting) constant $D$. Thereafter,
solving for $r$ yields

\[ r = \frac{\ln (E_{i+1}/E_i)}{\ln (h_{i+1}/h_i)} . \]

Since $r$ depends on $i$, i.e., which simulations we compare, we add an index to $r$:
$r_i$, where $i = 0, \ldots, m-2$, if we have $m$ experiments: $(h_0, E_0), \ldots, (h_{m-1}, E_{m-1})$.

In our present discretization of the wave equation we expect $r = 2$, and hence
the $r_i$ values should converge to 2 as $i$ increases.
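The rate formula translates directly into a few lines of array code. The following sketch uses a hypothetical helper estimate_rates and synthetic data that obey $E = Dh^2$ exactly (with the arbitrary choice D = 3), so all computed rates should equal 2 up to round-off.

```python
import numpy as np

def estimate_rates(h, E):
    """Return r_i = ln(E_{i+1}/E_i)/ln(h_{i+1}/h_i) for i = 0, ..., m-2."""
    h = np.asarray(h, dtype=float)
    E = np.asarray(E, dtype=float)
    return np.log(E[1:]/E[:-1])/np.log(h[1:]/h[:-1])

# Synthetic experiments: h is halved four times, E = 3*h^2
h = [0.1*2**(-i) for i in range(5)]
E = [3*h_i**2 for h_i in h]
rates = estimate_rates(h, E)
```

With real simulation data the first rates typically deviate from 2, and only the trend for small h is of interest.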


2.2.4 Constructing an Exact Solution of the Discrete Equations 


With a manufactured or known analytical solution, as outlined above, we can esti- 
mate convergence rates and see if they have the correct asymptotic behavior. Expe- 
rience shows that this is a quite good verification technique in that many common 
bugs will destroy the convergence rates. A significantly better test, though, would
be to check that the numerical solution is exactly what it should be. This will in
general require exact knowledge of the numerical error, which we do not normally
have (although Sect. 2.10 establishes such knowledge in simple cases). However,
it is possible to look for solutions where we can show that the numerical error
vanishes, i.e., the solution of the original continuous PDE problem is also a solution 
of the discrete equations. This property often arises if the exact solution of the PDE 
is a lower-order polynomial. (Truncation error analysis leads to error measures that 
involve derivatives of the exact solution. In the present problem, the truncation error 
involves 4th-order derivatives of u in space and time. Choosing u as a polynomial 
of degree three or less will therefore lead to vanishing error.) 

We shall now illustrate the construction of an exact solution to both the PDE 
itself and the discrete equations. Our chosen manufactured solution is quadratic in 
space and linear in time. More specifically, we set 


\[ u_e(x,t) = x(L-x)\left(1 + \frac{1}{2}t\right), \tag{2.30} \]

which by insertion in the PDE leads to $f(x,t) = 2(1 + \frac{1}{2}t)c^2$. This $u_e$ fulfills the
boundary conditions $u = 0$ and demands $I(x) = x(L-x)$ and $V(x) = \frac{1}{2}x(L-x)$.
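To realize that the chosen $u_e$ is also an exact solution of the discrete equations,
we first remind ourselves that $t_n = n\Delta t$ so that

\[ [D_t D_t t^2]^n = \frac{t_{n+1}^2 - 2t_n^2 + t_{n-1}^2}{\Delta t^2} = (n+1)^2 - 2n^2 + (n-1)^2 = 2, \tag{2.31} \]
\[ [D_t D_t t]^n = \frac{t_{n+1} - 2t_n + t_{n-1}}{\Delta t^2} = \frac{((n+1) - 2n + (n-1))\Delta t}{\Delta t^2} = 0 . \tag{2.32} \]

Hence,

\[ [D_t D_t u_e]_i^n = x_i(L - x_i)\left[D_t D_t \left(1 + \frac{1}{2}t\right)\right]^n = x_i(L - x_i)\frac{1}{2}[D_t D_t t]^n = 0 . \]

Similarly, we get that

\[ [D_x D_x u_e]_i^n = \left(1 + \frac{1}{2}t_n\right)[D_x D_x (xL - x^2)]_i = \left(1 + \frac{1}{2}t_n\right)[L D_x D_x x - D_x D_x x^2]_i = -2\left(1 + \frac{1}{2}t_n\right) . \]

Now, $f_i^n = 2(1 + \frac{1}{2}t_n)c^2$, which results in

\[ [D_t D_t u_e - c^2 D_x D_x u_e - f]_i^n = 0 + 2c^2\left(1 + \frac{1}{2}t_n\right) - 2\left(1 + \frac{1}{2}t_n\right)c^2 = 0 . \]

Moreover, $u_e(x_i, 0) = I(x_i)$, $\partial u_e/\partial t = V(x_i)$ at $t = 0$, and $u_e(x_0, t) =
u_e(x_{N_x}, t) = 0$. Also the modified scheme for the first time step is fulfilled by
$u_e(x_i, t_n)$.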

Therefore, the exact solution $u_e(x,t) = x(L-x)(1 + t/2)$ of the PDE problem is
also an exact solution of the discrete problem. This means that we know beforehand
what numbers the numerical algorithm should produce. We can use this fact to
check that the computed $u_i^n$ values from an implementation equal $u_e(x_i, t_n)$, within
machine precision. This result is valid regardless of the mesh spacings $\Delta x$ and $\Delta t$!
Nevertheless, there might be stability restrictions on $\Delta x$ and $\Delta t$, so the test can
only be run for a mesh that is compatible with the stability criterion (which in the
present case is $C \le 1$, to be derived later).
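The claim can also be checked numerically without running any solver: insert the polynomial in the difference equation at every interior mesh point and inspect the residual. The sketch below does this. The function name scheme_residual, the parameter values and the deliberately unrelated mesh spacings are illustrative assumptions.

```python
import numpy as np

def scheme_residual(u_e, f, c, x, t):
    """Max |[D_t D_t u - c^2 D_x D_x u - f]_i^n| over interior mesh points."""
    dx = x[1] - x[0]
    dt = t[1] - t[0]
    X, Tm = np.meshgrid(x, t, indexing='ij')   # X[i,n] = x_i, Tm[i,n] = t_n
    u = u_e(X, Tm)
    DtDt = (u[1:-1, 2:] - 2*u[1:-1, 1:-1] + u[1:-1, :-2])/dt**2
    DxDx = (u[2:, 1:-1] - 2*u[1:-1, 1:-1] + u[:-2, 1:-1])/dx**2
    R = DtDt - c**2*DxDx - f(X[1:-1, 1:-1], Tm[1:-1, 1:-1])
    return np.abs(R).max()

L, c = 2.5, 1.5
u_e = lambda x, t: x*(L - x)*(1 + 0.5*t)
f = lambda x, t: 2*(1 + 0.5*t)*c**2
# Arbitrary mesh: no particular relation between dx and dt is imposed
residual = scheme_residual(u_e, f, c,
                           x=np.linspace(0, L, 8),
                           t=np.linspace(0, 2, 12))
```

The residual should be zero apart from round-off, whatever mesh is chosen. The stability restriction only matters once the scheme is actually stepped forward in time.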


Notice

A product of quadratic or linear expressions in the various independent variables,
as shown above, will often fulfill both the PDE problem and the discrete equations,
and can therefore be very useful solutions for verifying implementations.
However, for 1D wave equations of the type $u_{tt} = c^2 u_{xx}$ we shall see that
there is always another much more powerful way of generating exact solutions
(which consists in just setting $C = 1$ (!), as shown in Sect. 2.10).
2.3 Implementation


This section presents the complete computational algorithm, its implementation in 
Python code, animation of the solution, and verification of the implementation. 

A real implementation of the basic computational algorithm from Sects. 2.1.5
and 2.1.6 can be encapsulated in a function, taking all the input data for the problem
as arguments. The physical input data consists of $c$, $I(x)$, $V(x)$, $f(x,t)$, $L$, and $T$.
The numerical input is the mesh parameters $\Delta t$ and $\Delta x$.

Instead of specifying $\Delta t$ and $\Delta x$, we can specify one of them and the Courant
number $C$ instead, since having explicit control of the Courant number is convenient
when investigating the numerical method. Many find it natural to prescribe
the resolution of the spatial grid and set $N_x$. The solver function can then compute
$\Delta t = CL/(cN_x)$. However, for comparing $u(x,t)$ curves (as functions of $x$) for
various Courant numbers it is more convenient to keep $\Delta t$ fixed for all $C$ and let
$\Delta x$ vary according to $\Delta x = c\Delta t/C$. With $\Delta t$ fixed, all frames correspond to the
same time $t$, and this simplifies animations that compare simulations with different
mesh resolutions. Plotting functions of $x$ with different spatial resolution is trivial,
so it is easier to let $\Delta x$ vary in the simulations than $\Delta t$.


2.3.1 Callback Function for User-Specific Actions 


The solution at all spatial points at a new time level is stored in an array u of length 
Nx + 1. We need to decide what to do with this solution, e.g., visualize the curve, 
analyze the values, or write the array to file for later use. The decision about what 
to do is left to the user in the form of a user-supplied function 


user_action(u, x, t, n) 


where u is the solution at the spatial points x at time t[n]. The user_action 
function is called from the solver at each time level n. 

If the user wants to plot the solution or store the solution at a time point, she
needs to write such a function and take appropriate actions inside it. We will show
examples of many such user_action functions.

Since the solver function makes calls back to the user’s code via such a func- 
tion, this type of function is called a callback function. When writing general 
software, like our solver function, which also needs to carry out special problem- or 
solution-dependent actions (like visualization), it is a common technique to leave 
those actions to user-supplied callback functions. 

The callback function can be used to terminate the solution process if the user
returns True. For example,

def my_user_action_function(u, x, t, n):
    return np.abs(u).max() > 10

is a callback function that will terminate the solver function (given below) if the
amplitude of the waves exceeds 10, which is here considered as a numerical
instability.
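How a time loop can honor the return value of the callback is sketched below. The function run_time_loop is a made-up stand-in for a solver: it simply doubles an artificial "solution" at each time level so that the termination criterion is triggered after a few steps.

```python
import numpy as np

def stop_if_unstable(u, x, t, n):
    """Callback: request termination when |u| exceeds 10."""
    return np.abs(u).max() > 10

def run_time_loop(user_action, Nt=20):
    """Stand-in for a solver (illustration only); returns the last level."""
    x = np.linspace(0, 1, 11)
    t = np.linspace(0, 1, Nt + 1)
    u = np.sin(np.pi*x)
    for n in range(1, Nt + 1):
        u = 2*u                  # artificial growth mimicking an instability
        if user_action is not None and user_action(u, x, t, n):
            break
    return n

n_stop = run_time_loop(stop_if_unstable)
```

Here the amplitude is 2, 4, 8, 16, ... at levels 1, 2, 3, 4, ..., so the loop is left at level 4. A callback that returns None (the default for a Python function without a return statement) never stops the loop.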


2.3.2 The Solver Function 


A first attempt at a solver function is listed below. 


import numpy as np
import time


def solver(I, V, f, c, L, dt, C, T, user_action=None):
    """Solve u_tt=c^2*u_xx + f on (0,L)x(0,T]."""
    Nt = int(round(T/dt))
    t = np.linspace(0, Nt*dt, Nt+1)   # Mesh points in time
    dx = dt*c/float(C)
    Nx = int(round(L/dx))
    x = np.linspace(0, L, Nx+1)       # Mesh points in space
    C2 = C**2                         # Help variable in the scheme
    # Make sure dx and dt are compatible with x and t
    dx = x[1] - x[0]
    dt = t[1] - t[0]

    if f is None or f == 0:
        f = lambda x, t: 0
    if V is None or V == 0:
        V = lambda x: 0

    u     = np.zeros(Nx+1)   # Solution array at new time level
    u_n   = np.zeros(Nx+1)   # Solution at 1 time level back
    u_nm1 = np.zeros(Nx+1)   # Solution at 2 time levels back

    # Measure CPU time (time.clock is not available in recent Python)
    t0 = time.process_time()

    # Load initial condition into u_n
    for i in range(0, Nx+1):
        u_n[i] = I(x[i])

    if user_action is not None:
        user_action(u_n, x, t, 0)

    # Special formula for first time step
    n = 0
    for i in range(1, Nx):
        u[i] = u_n[i] + dt*V(x[i]) + \
               0.5*C2*(u_n[i-1] - 2*u_n[i] + u_n[i+1]) + \
               0.5*dt**2*f(x[i], t[n])
    u[0] = 0;  u[Nx] = 0

    if user_action is not None:
        user_action(u, x, t, 1)

    # Switch variables before next step
    u_nm1[:] = u_n;  u_n[:] = u

    for n in range(1, Nt):
        # Update all inner points at time t[n+1]
        for i in range(1, Nx):
            u[i] = - u_nm1[i] + 2*u_n[i] + \
                   C2*(u_n[i-1] - 2*u_n[i] + u_n[i+1]) + \
                   dt**2*f(x[i], t[n])

        # Insert boundary conditions
        u[0] = 0;  u[Nx] = 0
        if user_action is not None:
            if user_action(u, x, t, n+1):
                break

        # Switch variables before next step
        u_nm1[:] = u_n;  u_n[:] = u

    cpu_time = time.process_time() - t0
    return u, x, t, cpu_time


A couple of remarks about the above code are perhaps necessary:

• Although we give dt and compute dx via C and c, the resulting t and x meshes
  do not necessarily correspond exactly to these values because of rounding errors.
  To explicitly ensure that dx and dt correspond to the cell sizes in x and t, we
  recompute the values.

• According to the particular choice made in Sect. 2.3.1, a true value returned from
  user_action should terminate the simulation. This is here implemented by a
  break statement inside the for loop in the solver.
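The first remark can be made concrete with a small experiment. The sketch below repeats the mesh set-up lines of the solver in a stand-alone helper (mesh_from_dt_and_C, a name invented here) and reports the cell size that results. The input values are chosen so that L/dx is not an integer.

```python
import numpy as np

def mesh_from_dt_and_C(c, L, dt, C, T):
    """Mimic the mesh set-up in solver and return what is actually used."""
    Nt = int(round(T/dt))
    t = np.linspace(0, Nt*dt, Nt + 1)
    dx_requested = dt*c/float(C)
    Nx = int(round(L/dx_requested))
    x = np.linspace(0, L, Nx + 1)
    dx = x[1] - x[0]              # actual cell size, equals L/Nx
    dt = t[1] - t[0]
    return Nx, dx_requested, dx, c*dt/dx

Nx, dx_requested, dx, C_effective = mesh_from_dt_and_C(
    c=1.0, L=1.0, dt=0.1, C=0.73, T=1.0)
```

In this example L/dx_requested is 7.3, which is rounded to Nx = 7, so the actual dx is 1/7 and the ratio c*dt/dx on the mesh becomes 0.7 rather than the requested 0.73. Choosing dt and C such that L/dx is an integer avoids the discrepancy.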


2.3.3 Verification: Exact Quadratic Solution 


We use the test problem derived in Sect. 2.2.4 for verification. Below is a unit test
based on this test problem and realized as a proper test function compatible with the
unit test frameworks nose or pytest.


def test_quadratic():
    """Check that u(x,t)=x(L-x)(1+t/2) is exactly reproduced."""

    def u_exact(x, t):
        return x*(L-x)*(1 + 0.5*t)

    def I(x):
        return u_exact(x, 0)

    def V(x):
        return 0.5*u_exact(x, 0)

    def f(x, t):
        return 2*(1 + 0.5*t)*c**2

    L = 2.5
    c = 1.5
    C = 0.75
    Nx = 6  # Very coarse mesh for this exact test
    dt = C*(L/Nx)/c
    T = 18

    def assert_no_error(u, x, t, n):
        u_e = u_exact(x, t[n])
        diff = np.abs(u - u_e).max()
        tol = 1E-13
        assert diff < tol

    solver(I, V, f, c, L, dt, C, T,
           user_action=assert_no_error)


When this function resides in the file wave1D_u0.py, one can run pytest to check
that all test functions with names test_*() in this file work:

Terminal> py.test -s -v wave1D_u0.py


2.3.4 Verification: Convergence Rates 


A more general method, but not so reliable as a verification method, is to compute
the convergence rates and see if they coincide with theoretical estimates. Here we
expect a rate of 2 according to the various results in Sect. 2.10. A general function
for computing convergence rates can be written like this:


def convergence_rates(
    u_exact,                 # Python function for exact solution
    I, V, f, c, L,           # physical parameters
    dt0, num_meshes, C, T):  # numerical parameters
    """
    Halve the time step and estimate convergence rates
    for num_meshes simulations.
    """
    # First define an appropriate user action function
    global error
    error = 0  # error computed in the user action function

    def compute_error(u, x, t, n):
        global error  # must be global to be altered here
        # (otherwise error is a local variable, different
        # from error defined in the parent function)
        if n == 0:
            error = 0
        else:
            error = max(error, np.abs(u - u_exact(x, t[n])).max())

    # Run finer and finer resolutions and compute true errors
    E = []
    h = []  # dt, solver adjusts dx such that C=dt*c/dx
    dt = dt0
    for i in range(num_meshes):
        solver(I, V, f, c, L, dt, C, T,
               user_action=compute_error)
        # error is computed in the final call to compute_error
        E.append(error)
        h.append(dt)
        dt /= 2  # halve the time step for next simulation
    print('E:', E)
    print('h:', h)
    # Convergence rates for two consecutive experiments
    r = [np.log(E[i]/E[i-1])/np.log(h[i]/h[i-1])
         for i in range(1, num_meshes)]
    return r


Using the analytical solution from Sect. 2.2.2, we can call convergence_rates 
to see if we get a convergence rate that approaches 2 and use the final estimate of the 
rate in an assert statement such that this function becomes a proper test function: 


def test_convrate_sincos():
    n = m = 2
    L = 1.0
    u_exact = lambda x, t: np.cos(m*np.pi/L*t)*np.sin(m*np.pi/L*x)

    r = convergence_rates(
        u_exact=u_exact,
        I=lambda x: u_exact(x, 0),
        V=lambda x: 0,
        f=0,
        c=1,
        L=L,
        dt0=0.1,
        num_meshes=6,
        C=0.9,
        T=1)
    print('rates sin(x)*cos(t) solution:',
          [round(r_, 2) for r_ in r])
    assert abs(r[-1] - 2) < 0.002


Doing py.test -s -v wave1D_u0.py will run also this test function and show 
the rates 2.05, 1.98, 2.00, 2.00, and 2.00 (to two decimals). 


2.3.5 Visualization: Animating the Solution 


Now that we have verified the implementation it is time to do a real computation 
where we also display evolution of the waves on the screen. Since the solver 
function knows nothing about what type of visualizations we may want, it calls 
the callback function user_action(u, x, t, n). We must therefore write this 
function and find the proper statements for plotting the solution. 




Function for administering the simulation The following viz function

1. defines a user_action callback function for plotting the solution at each time
   level,
2. calls the solver function, and
3. combines all the plots (in files) to video in different formats.


def viz(
    I, V, f, c, L, dt, C, T,  # PDE parameters
    umin, umax,               # Interval for u in plots
    animate=True,             # Simulation with animation?
    tool='matplotlib',        # 'matplotlib' or 'scitools'
    solver_function=solver,   # Function with numerical algorithm
    ):
    """Run solver and visualize u at each time level."""

    def plot_u_st(u, x, t, n):
        """user_action function for solver."""
        plt.plot(x, u, 'r-',
                 xlabel='x', ylabel='u',
                 axis=[0, L, umin, umax],
                 title='t=%f' % t[n], show=True)
        # Let the initial condition stay on the screen for 2
        # seconds, else insert a pause of 0.2 s between each plot
        time.sleep(2) if t[n] == 0 else time.sleep(0.2)
        plt.savefig('tmp_%04d.png' % n)  # for movie making

    class PlotMatplotlib:
        def __call__(self, u, x, t, n):
            """user_action function for solver."""
            if n == 0:
                plt.ion()
                self.lines = plt.plot(x, u, 'r-')
                plt.xlabel('x');  plt.ylabel('u')
                plt.axis([0, L, umin, umax])
                plt.legend(['t=%f' % t[n]], loc='lower left')
            else:
                self.lines[0].set_ydata(u)
                plt.legend(['t=%f' % t[n]], loc='lower left')
                plt.draw()
            time.sleep(2) if t[n] == 0 else time.sleep(0.2)
            plt.savefig('tmp_%04d.png' % n)  # for movie making

    if tool == 'matplotlib':
        import matplotlib.pyplot as plt
        plot_u = PlotMatplotlib()
    elif tool == 'scitools':
        import scitools.std as plt  # scitools.easyviz interface
        plot_u = plot_u_st
    import time, glob, os

    # Clean up old movie frames
    for filename in glob.glob('tmp_*.png'):
        os.remove(filename)

    # Call solver and do the simulation
    user_action = plot_u if animate else None
    u, x, t, cpu = solver_function(
        I, V, f, c, L, dt, C, T, user_action)

    # Make video files
    fps = 4  # frames per second
    codec2ext = dict(flv='flv', libx264='mp4', libvpx='webm',
                     libtheora='ogg')  # video formats
    filespec = 'tmp_%04d.png'
    movie_program = 'ffmpeg'  # or 'avconv'
    for codec in codec2ext:
        ext = codec2ext[codec]
        cmd = '%(movie_program)s -r %(fps)d -i %(filespec)s '\
              '-vcodec %(codec)s movie.%(ext)s' % vars()
        os.system(cmd)

    if tool == 'scitools':
        # Make an HTML play for showing the animation in a browser
        plt.movie('tmp_*.png', encoder='html', fps=fps,
                  output_file='movie.html')
    return cpu


Dissection of the code The viz function can either use SciTools or Matplotlib for
visualizing the solution. The user_action function based on SciTools is called
plot_u_st, while the user_action function based on Matplotlib is a bit more
complicated as it is realized as a class and needs statements that differ from those
for making static plots. SciTools can utilize both Matplotlib and Gnuplot (and many
other plotting programs) for doing the graphics, but Gnuplot is a relevant choice for
large $N_x$ or in two-dimensional problems as Gnuplot is significantly faster than
Matplotlib for screen animations.

A function inside another function, like plot_u_st in the above code segment,
has access to and remembers all the local variables in the surrounding code
inside the viz function (!). This is known in computer science as a closure and is
very convenient to program with. For example, the plt and time modules
defined outside plot_u_st are accessible for plot_u_st when the function is called (as
user_action) in the solver function. Some may think, however, that a class
instead of a closure is a cleaner and easier-to-understand implementation of the user
action function, see Sect. 2.8.

The plot_u_st function just makes a standard SciTools plot command for
plotting u as a function of x at time t[n]. To achieve a smooth animation, the plot
command should take keyword arguments instead of being broken into separate
calls to xlabel, ylabel, axis, title, and show. Several plot calls will automatically
cause an animation on the screen. In addition, we want to save each frame in
the animation to file. We then need a filename where the frame number is padded
with zeros, here tmp_0000.png, tmp_0001.png, and so on. The proper printf
construction is then tmp_%04d.png. Section 1.3.2 contains more basic information on
making animations.

The solver is called with an argument plot_u as user_action. If the user
chooses to use SciTools, plot_u is the plot_u_st callback function, but for
Matplotlib it is an instance of the class PlotMatplotlib. Also this class makes use of
variables defined in the viz function: plt and time. With Matplotlib, one has to
make the first plot the standard way, and then update the y data in the plot at every
time level. The update requires active use of the returned value from plt.plot in
the first plot. This value would need to be stored in a local variable if we were to
use a closure for the user_action function when doing the animation with
Matplotlib. It is much easier to store the variable as a class attribute self.lines. Since
the class is essentially a function, we implement the function as the special method
__call__ such that the instance plot_u(u, x, t, n) can be called as a standard
callback function from solver.
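The difference between the two ways of remembering data between calls can be seen in a plot-free setting. In the sketch below both callbacks record the maximum amplitude at each time level. The names make_max_tracker_closure and MaxTracker, and the dummy "solution" u = (n+1)x, are invented for this illustration.

```python
import numpy as np

def make_max_tracker_closure():
    """Closure variant: the state is a list in the enclosing scope."""
    history = []
    def user_action(u, x, t, n):
        history.append(np.abs(u).max())
    return user_action, history

class MaxTracker:
    """Class variant: the state is an attribute, like self.lines."""
    def __init__(self):
        self.history = []
    def __call__(self, u, x, t, n):
        self.history.append(np.abs(u).max())

x = np.linspace(0, 1, 6)
t = np.linspace(0, 1, 4)
closure_action, closure_history = make_max_tracker_closure()
class_action = MaxTracker()
for n in range(t.size):
    u = (n + 1)*x                 # dummy "solution" at time level n
    closure_action(u, x, t, n)
    class_action(u, x, t, n)
```

Both objects are called with the same four arguments, so a solver cannot tell them apart. The closure works here because the list is mutated in place; rebinding a plain variable inside the inner function would require a nonlocal declaration, which is one reason the class form is often perceived as simpler.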


Making movie files From the tmp_*.png files containing the frames in the
animation we can make video files. Section 1.3.2 presents basic information on how to
use the ffmpeg (or avconv) program for producing video files in different modern
formats: Flash, MP4, WebM, and Ogg.

The viz function creates an ffmpeg or avconv command with the proper
arguments for each of the formats Flash, MP4, WebM, and Ogg. The task is greatly
simplified by having a codec2ext dictionary for mapping video codec names to
filename extensions. As mentioned in Sect. 1.3.2, only two formats are actually
needed to ensure that all browsers can successfully play the video: MP4 and WebM.

Some animations having a large number of plot files may not be properly
combined into a video using ffmpeg or avconv. A method that always works is to play
the PNG files as an animation in a browser using JavaScript code in an HTML file.
The SciTools package has a function movie (or a stand-alone command scitools
movie) for creating such an HTML player. The plt.movie call in the viz function
shows how the function is used. The file movie.html can be loaded into a browser
and features a user interface where the speed of the animation can be controlled.
Note that the movie in this case consists of the movie.html file and all the frame
files tmp_*.png.
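As an alternative sketch, the commands can be assembled as argument lists instead of one formatted string, which avoids quoting problems if file names contain spaces. The function movie_commands is a hypothetical helper; it only builds the lists, using the same options (-r, -i, -vcodec) as the viz function, and runs nothing.

```python
def movie_commands(filespec='tmp_%04d.png', fps=4, movie_program='ffmpeg'):
    """Return one argument list per video format (nothing is executed)."""
    codec2ext = dict(flv='flv', libx264='mp4', libvpx='webm',
                     libtheora='ogg')
    return [[movie_program, '-r', str(fps), '-i', filespec,
             '-vcodec', codec, 'movie.' + ext]
            for codec, ext in codec2ext.items()]

commands = movie_commands()
# Each list could be passed to subprocess.call(cmd) if ffmpeg is installed
```

Whether a given codec is available depends on how the local ffmpeg or avconv program was built, so some of the four commands may fail on a particular machine.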


Skipping frames for animation speed Sometimes the time step is small and T
is large, leading to an inconveniently large number of plot files and a slow
animation on the screen. The solution to such a problem is to decide on a total
number of frames in the animation, num_frames, and plot the solution only for
every skip_frame frames. For example, setting skip_frame=5 leads to plots of
every fifth frame. The default value skip_frame=1 plots every frame. The total
number of time levels (i.e., maximum possible number of frames) is the length of t,
t.size (or len(t)), so if we want num_frames frames in the animation, we need
to plot every t.size/num_frames frames:

skip_frame = int(t.size/float(num_frames))
if n % skip_frame == 0 or n == t.size-1:
    st.plot(x, u, 'r-', ...)

The initial condition (n=0) is included by n % skip_frame == 0, as well as every
skip_frame-th frame. As n % skip_frame == 0 will very seldom be true for
the very final frame, we must also check if n == t.size-1 to get the final frame
included.

A simple choice of numbers may illustrate the formulas: say we have 801 frames
in total (t.size) and we allow only 60 frames to be plotted. As n then runs from
0 to 800, we need to plot every 801/60 frame, which with integer division yields
13 as skip_frame. Using the mod function, n % skip_frame, this operation is
zero every time n can be divided by 13 without a remainder. That is, the if test is
true when n equals 0, 13, 26, 39, ..., 780, 793, and, by the second condition, 800.
The associated code is included in the plot_u function, inside the viz function, in
the file wave1D_u0.py.
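The numbers in this example are easy to reproduce. The following lines are a small stand-alone sketch (not part of wave1D_u0.py) that lists the time levels passing the if test.

```python
import numpy as np

t = np.linspace(0, 1, 801)        # 801 time levels, n = 0, ..., 800
num_frames = 60
skip_frame = int(t.size/float(num_frames))
plotted = [n for n in range(t.size)
           if n % skip_frame == 0 or n == t.size - 1]
```

Because of the truncation to an integer, the number of plotted frames (here 63) ends up slightly above num_frames.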


2.3.6 Running a Case 


The first demo of our 1D wave equation solver concerns vibrations of a string that
is initially deformed to a triangular shape, like when picking a guitar string:

\[ I(x) = \begin{cases} ax/x_0, & x < x_0, \\ a(L-x)/(L-x_0), & \text{otherwise.} \end{cases} \tag{2.33} \]

We choose $L = 75$ cm, $x_0 = 0.8L$, $a = 5$ mm, and a time frequency $\nu = 440$ Hz.
The relation between the wave speed $c$ and $\nu$ is $c = \nu\lambda$, where $\lambda$ is the wavelength,
taken as $2L$ because the longest wave on the string forms half a wavelength. There
is no external force, so $f = 0$ (meaning we can neglect gravity), and the string is
at rest initially, implying $V = 0$.

Regarding numerical parameters, we need to specify a $\Delta t$. Sometimes it is more
natural to think of a spatial resolution instead of a time step. A natural semi-coarse
spatial resolution in the present problem is $N_x = 50$. We can then choose the
associated $\Delta t$ (as required by the viz and solver functions) as the stability limit:
$\Delta t = L/(N_x c)$. This is the $\Delta t$ to be specified, but notice that if $C < 1$, the actual
$\Delta x$ computed in solver gets larger than $L/N_x$: $\Delta x = c\Delta t/C = L/(N_x C)$. (The
reason is that we fix $\Delta t$ and adjust $\Delta x$, so if $C$ gets smaller, the code implements
this effect in terms of a larger $\Delta x$.)
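The relation between the chosen time step and the resulting spatial mesh can be tabulated with a few lines. The helper actual_mesh is a name introduced here for illustration; it repeats the computation of dx and Nx done inside solver.

```python
L = 0.75                  # string length [m]
freq = 440                # frequency [Hz]
c = freq*2*L              # wave speed from c = nu*lambda with lambda = 2L
Nx_target = 50
dt = L/(Nx_target*c)      # stability limit (C = 1) for Nx = 50

def actual_mesh(C):
    """Return (dx, Nx) resulting from the fixed dt and Courant number C."""
    dx = c*dt/C
    return dx, int(round(L/dx))

dx_1, Nx_1 = actual_mesh(C=1.0)
dx_08, Nx_08 = actual_mesh(C=0.8)
```

With these numbers the wave speed is 660 m/s, and lowering C from 1 to 0.8 coarsens the mesh from 50 to 40 cells.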

A function for setting the physical and numerical parameters and calling viz in 
this application goes as follows: 


def guitar(C):
    """Triangular wave (pulled guitar string)."""
    L = 0.75
    x0 = 0.8*L
    a = 0.005
    freq = 440
    wavelength = 2*L
    c = freq*wavelength
    omega = 2*np.pi*freq
    num_periods = 1
    T = 2*np.pi/omega*num_periods
    # Choose dt the same as the stability limit for Nx=50
    dt = L/50./c

    def I(x):
        return a*x/x0 if x < x0 else a/(L-x0)*(L-x)

    umin = -1.2*a;  umax = -umin
    cpu = viz(I, 0, 0, c, L, dt, C, T, umin, umax,
              animate=True, tool='scitools')
The associated program has the name wave1D_u0.py. Run the program and watch
the movie of the vibrating string³. The string should ideally consist of straight
segments, but these are somewhat wavy due to numerical approximation. Run the
case with the wave1D_u0.py code and $C = 1$ to see the exact solution.


2.3.7 Working with a Scaled PDE Model 


Depending on the model, it may be a substantial job to establish consistent and rel- 
evant physical parameter values for a case. The guitar string example illustrates the 
point. However, by scaling the mathematical problem we can often reduce the need 
to estimate physical parameters dramatically. The scaling technique consists of in- 
troducing new independent and dependent variables, with the aim that the absolute 
values of these lie in [0, 1]. We introduce the dimensionless variables (details are 
found in Section 3.1.1 in [11]) 


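\[ \bar x = \frac{x}{L}, \quad \bar t = \frac{c}{L}t, \quad \bar u = \frac{u}{a} . \]

Here, $L$ is a typical length scale, e.g., the length of the domain, and $a$ is a typical
size of $u$, e.g., determined from the initial condition: $a = \max_x |I(x)|$.
We get by the chain rule that

\[ \frac{\partial u}{\partial t} = \frac{\partial}{\partial \bar t}(a\bar u)\frac{d\bar t}{dt} = \frac{ac}{L}\frac{\partial \bar u}{\partial \bar t} . \]

Similarly,

\[ \frac{\partial u}{\partial x} = \frac{a}{L}\frac{\partial \bar u}{\partial \bar x} . \]

Inserting the dimensionless variables in the PDE gives, in case $f = 0$,

\[ \frac{a c^2}{L^2}\frac{\partial^2 \bar u}{\partial \bar t^2} = \frac{a c^2}{L^2}\frac{\partial^2 \bar u}{\partial \bar x^2} . \]

Dropping the bars, we arrive at the scaled PDE

\[ \frac{\partial^2 u}{\partial t^2} = \frac{\partial^2 u}{\partial x^2}, \quad x \in (0,1),\ t \in (0, cT/L), \tag{2.34} \]

which has no parameter $c^2$ anymore. The initial conditions are scaled as

\[ a\bar u(\bar x, 0) = I(L\bar x) \]

and

\[ \frac{a}{L/c}\frac{\partial \bar u}{\partial \bar t}(\bar x, 0) = V(L\bar x), \]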
3 http://tinyurl.com/hbcasmj/wave/html/mov-wave/guitar_C0.8/movie.html 
resulting in

\[ \bar u(\bar x, 0) = \frac{I(L\bar x)}{\max_x |I(x)|}, \quad \frac{\partial \bar u}{\partial \bar t}(\bar x, 0) = \frac{L}{ac}V(L\bar x) . \]

In the common case $V = 0$ we see that there are no physical parameters to be
estimated in the PDE model!

If we have a program implemented for the physical wave equation with dimensions,
we can obtain the dimensionless, scaled version by setting $c = 1$. The initial
condition of a guitar string, given in (2.33), gets its scaled form by choosing $a = 1$,
$L = 1$, and $x_0 \in [0,1]$. This means that we only need to decide on the $x_0$ value
as a fraction of unity, because the scaled problem corresponds to setting all other
parameters to unity. In the code we can just set a=c=L=1, x0=0.8, and there is no
need to calculate with wavelengths and frequencies to estimate c!
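For the scaled problem, the initial shape (2.33) reduces to a function of x0 only. A possible array-friendly formulation is sketched below; the name I_scaled and the use of np.where are choices made for this illustration, not taken from wave1D_u0.py.

```python
import numpy as np

def I_scaled(x, x0=0.8):
    """Scaled triangular initial shape (2.33) with a = L = 1."""
    x = np.asarray(x, dtype=float)
    return np.where(x < x0, x/x0, (1 - x)/(1 - x0))

x = np.linspace(0, 1, 11)
u0 = I_scaled(x)
```

The function accepts both scalars and arrays, has its peak value 1 at x = x0, and vanishes at both ends of the unit interval.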

The only non-trivial parameter to estimate in the scaled problem is the final end 
time of the simulation, or more precisely, how it relates to periods in periodic so- 
lutions in time, since we often want to express the end time as a certain number of 
periods. The period in the dimensionless problem is 2, so the end time can be set to 
the desired number of periods times 2. 

Why the dimensionless period is 2 can be explained by the following reasoning.
Suppose that u behaves as $\cos(\omega t)$ in time in the original problem with
dimensions. The corresponding period is then $P = 2\pi/\omega$, but we need to
estimate $\omega$. A typical solution of the wave equation is
$u(x,t) = A\sin(kx)\cos(\omega t)$, where A is an amplitude and k is related to the
wave length $\lambda$ in space: $\lambda = 2\pi/k$. Both $\lambda$ and A will be given
by the initial condition I(x). Inserting this u(x,t) in the PDE yields
$-\omega^2 = -c^2k^2$, i.e., $\omega = kc$. The period is therefore $P = 2\pi/(kc)$.
If the boundary conditions are u(0,t) = u(L,t) = 0, we need to have $kL = n\pi$ for
integer n. The period becomes $P = 2L/(nc)$. The longest period is $P = 2L/c$. The
dimensionless period $\bar P$ is obtained by dividing P by the time scale L/c, which
results in $\bar P = 2$. Shorter waves in the initial condition will have a shorter
dimensionless period $\bar P = 2/n$ (n > 1).
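
The reasoning above can be checked with a few lines of Python. This is only a sketch: the function name dimensionless_period and the sample values of L and c are our own illustrative choices and do not appear in the accompanying programs. The function computes the period of mode n from $kL = n\pi$ and $\omega = kc$, and then divides by the time scale L/c:

```python
import math

def dimensionless_period(n, L=1.0, c=1.0):
    """Period of mode n divided by the time scale L/c (hypothetical helper)."""
    k = n*math.pi/L          # wave number from the condition k*L = n*pi
    omega = k*c              # relation omega = k*c from inserting u in the PDE
    P = 2*math.pi/omega      # period with dimensions
    return P/(L/c)           # scaled (dimensionless) period

# Arbitrary sample values of L and c; they should not influence the result
for n in (1, 2, 3):
    print(n, dimensionless_period(n, L=0.75, c=440.0))
```

Whatever values of L and c are supplied, the result should equal 2/n up to round-off, which is why no physical parameters are needed to pick the end time in the scaled problem.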


2.4 Vectorization 


The computational algorithm for solving the wave equation visits one mesh point
at a time and evaluates a formula for the new value u at that point. Technically,
this is implemented by a loop over array elements in a program. Such loops may
run slowly in Python (and similar interpreted languages such as R and MATLAB).
One technique for speeding up loops is to perform operations on entire arrays
instead of working with one element at a time. This is referred to as vectorization,
vector computing, or array computing. Operations on whole arrays are possible if
the computations involving each element are independent of each other and therefore
can, at least in principle, be performed simultaneously. That is, vectorization not
only speeds up the code on serial computers, but also makes it easy to exploit
parallel computing. Actually, there are Python tools like Numba* that can
automatically turn vectorized code into parallel code.


* http://numba.pydata.org 
Fig. 2.3 Illustration of subtracting two slices of two arrays

2.4.1 Operations on Slices of Arrays


Efficient computing with numpy arrays demands that we avoid loops and compute
with entire arrays at once (or at least large portions of them). Consider this
calculation of differences $d_i = u_{i+1} - u_i$:


n = u.size
for i in range(0, n-1):
    d[i] = u[i+1] - u[i]


All the differences here are independent of each other. The computation of d can
therefore alternatively be done by subtracting the array $(u_0, u_1, \ldots, u_{n-2})$
from the array where the elements are shifted one index upwards:
$(u_1, u_2, \ldots, u_{n-1})$, see Fig. 2.3. The former subset of the array can be
expressed by u[0:n-1], u[0:-1], or just u[:-1], meaning from index 0 up to, but not
including, the last element (-1). The latter subset is obtained by u[1:n] or u[1:],
meaning from index 1 and the rest of the array. The computation of d can now be
done without an explicit Python loop:


d = u[1:] - u[:-1]


or with explicit limits if desired:


d = u[1:n] - u[0:n-1]


Indices with a colon, going from an index to (but not including) another index are 
called slices. With numpy arrays, the computations are still done by loops, but in 
efficient, compiled, highly optimized C or Fortran code. Such loops are sometimes 
referred to as vectorized loops. Such loops can also easily be distributed among 
many processors on parallel computers. We say that the scalar code above, working 
on an element (a scalar) at a time, has been replaced by an equivalent vectorized 
code. The process of vectorizing code is called vectorization. 
Test your understanding

Newcomers to vectorization are encouraged to choose a small array u, say with 
five elements, and simulate with pen and paper both the loop version and the 
vectorized version above. 
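
For readers who want to compare the pen-and-paper result with a computation, here is a minimal sketch with a five-element array. The sample values in u are arbitrary choices of ours; any numbers will do.

```python
import numpy as np

u = np.array([1.0, 4.0, 9.0, 16.0, 25.0])   # arbitrary sample values
n = u.size

# Scalar version: visit one element at a time
d_scalar = np.zeros(n-1)
for i in range(0, n-1):
    d_scalar[i] = u[i+1] - u[i]

# Vectorized version: subtract two shifted slices of the same array
d_vec = u[1:] - u[:-1]

print(d_scalar)
print(d_vec)
```

Both print statements should show the same four differences. The numpy function np.diff(u) computes the same quantity and can serve as an independent check.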


Finite difference schemes basically contain differences between array elements 
with shifted indices. As an example, consider the updating formula 


for i in range(1, n-1):
    u2[i] = u[i-1] - 2*u[i] + u[i+1]


The vectorization consists of replacing the loop by arithmetic on slices of arrays
of length n-2:


u2 = u[:-2] - 2*u[1:-1] + u[2:]
u2 = u[0:n-2] - 2*u[1:n-1] + u[2:n]   # alternative


Note that the length of u2 becomes n-2. If u2 is already an array of length n and we
want to use the formula to update all the “inner” elements of u2, as we will when
solving a 1D wave equation, we can write


u2[1:-1] = u[:-2] - 2*u[1:-1] + u[2:]
u2[1:n-1] = u[0:n-2] - 2*u[1:n-1] + u[2:n]   # alternative


The first expression’s right-hand side is realized by the following steps, involving 
temporary arrays with intermediate results, since each array operation can only in- 
volve one or two arrays. The numpy package performs (behind the scenes) the first 
line above in four steps: 


temp1 = 2*u[1:-1] 
temp2 = u[:-2] - temp1
temp3 = temp2 + u[2:] 
u2[1:-1] = temp3 


We need three temporary arrays, but a user does not need to worry about such 
temporary arrays. 


Common mistakes with array slices
Array expressions with slices demand that the slices have the same shape. It is easy
to make a mistake in, e.g.,


u2[1:n-1] = u[0:n-2] - 2*u[1:n-1] + u[2:n]


and write


u2[1:n-1] = u[0:n-2] - 2*u[1:n-1] + u[1:n]

Now u[1:n] has wrong length (n-1) compared to the other array slices (n-2), causing
a ValueError with a message like operands could not be broadcast together
with shapes (103,) (104,) (if n is 105). When such errors occur one must
closely examine all the slices. Usually, it is easier to get upper limits of slices
right when they use -1 or -2 or empty limit rather than expressions involving
the length.

Another common mistake, when u2 has length n, is to forget the slice in the
array on the left-hand side,


u2 = u[0:n-2] - 2*u[1:n-1] + u[2:n]


This is really crucial: now u2 becomes a new array of length n-2, which is the
wrong length as we have no entries for the boundary values. We meant to insert
the right-hand side array into the original u2 array for the entries that correspond
to the internal points in the mesh (1:n-1 or 1:-1).
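
Both mistakes are easy to reproduce in a small experiment. The sketch below uses an arbitrary array length (n = 6) and our own helper variable names (u2_original, inplace_ok, raised, rebound), none of which come from the accompanying programs.

```python
import numpy as np

n = 6
u = np.linspace(0, 1, n)**2
u2 = np.zeros(n)
u2_original = u2             # a second name for the same array object

# Correct: fill the inner entries of the existing array in place
u2[1:-1] = u[:-2] - 2*u[1:-1] + u[2:]
inplace_ok = (u2 is u2_original) and u2.size == n

# Mistake 1: slices of different lengths raise a ValueError
try:
    u2[1:n-1] = u[0:n-2] - 2*u[1:n-1] + u[1:n]
    raised = False
except ValueError:
    raised = True

# Mistake 2: no slice on the left-hand side binds the name u2 to a new,
# shorter array; the original array of length n is left untouched
u2 = u[:-2] - 2*u[1:-1] + u[2:]
rebound = (u2 is not u2_original) and u2.size == n - 2

print(inplace_ok, raised, rebound)
```

The second mistake is the more dangerous one since it raises no exception: the program continues with an array of the wrong length.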


Vectorization may also work nicely with functions. To illustrate, we may extend 
the previous example as follows: 


def f(x):
    return x**2 + 1


for i in range(1, n-1):
    u2[i] = u[i-1] - 2*u[i] + u[i+1] + f(x[i])


Assuming u2, u, and x all have length n, the vectorized version becomes


u2[1:-1] = u[:-2] - 2*u[1:-1] + u[2:] + f(x[1:-1])


Obviously, f must be able to take an array as argument for f(x[1:-1]) to make
sense.
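
What it means for a function to "take an array as argument" can be illustrated by contrasting math.sin with np.sin. The two functions f_scalar_only and f_array below are hypothetical examples of ours, chosen only to show the difference.

```python
import math
import numpy as np

def f_scalar_only(x):
    return math.sin(x) + 1      # math.sin accepts a single number only

def f_array(x):
    return np.sin(x) + 1        # np.sin works elementwise on arrays

x = np.linspace(0, 1, 5)

# Calling the scalar-only function with a slice fails
try:
    f_scalar_only(x[1:-1])
    scalar_only_failed = False
except TypeError:
    scalar_only_failed = True

values = f_array(x[1:-1])       # array with one value per inner point
print(scalar_only_failed, values.shape)
```

A function built from arithmetic operators only, like x**2 + 1 above, works for arrays without modification. Functions from the math module must be replaced by their numpy counterparts.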


2.4.2 Finite Difference Schemes Expressed as Slices 


We now have the necessary tools to vectorize the wave equation algorithm as de- 
scribed mathematically in Sect. 2.1.5 and through code in Sect. 2.3.2. There are 
three loops: one for the initial condition, one for the first time step, and finally the 
loop that is repeated for all subsequent time levels. Since only the latter is repeated 
a potentially large number of times, we limit our vectorization efforts to this loop. 
Within the time loop, the space loop reads: 


for i in range(1, Nx):
    u[i] = 2*u_n[i] - u_nm1[i] + \
           C2*(u_n[i-1] - 2*u_n[i] + u_n[i+1])

The vectorized version becomes


u[1:-1] = - u_nm1[1:-1] + 2*u_n[1:-1] + \
          C2*(u_n[:-2] - 2*u_n[1:-1] + u_n[2:])


or


u[1:Nx] = 2*u_n[1:Nx] - u_nm1[1:Nx] + \
          C2*(u_n[0:Nx-1] - 2*u_n[1:Nx] + u_n[2:Nx+1])


The program wave1D_u0v.py contains a new version of the function solver 
where both the scalar and the vectorized loops are included (the argument version 
is set to scalar or vectorized, respectively). 
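
Before trusting a vectorized update it is wise to check that it reproduces the scalar loop. The following is a minimal sketch of such a check for a single time level. The function names step_scalar and step_vectorized and the test data are our own assumptions and are not taken from wave1D_u0v.py.

```python
import numpy as np

def step_scalar(u_n, u_nm1, C2):
    """Advance the inner points one time level with an explicit loop."""
    Nx = u_n.size - 1
    u = np.zeros_like(u_n)
    for i in range(1, Nx):
        u[i] = 2*u_n[i] - u_nm1[i] + \
               C2*(u_n[i-1] - 2*u_n[i] + u_n[i+1])
    return u

def step_vectorized(u_n, u_nm1, C2):
    """The same update expressed with array slices."""
    u = np.zeros_like(u_n)
    u[1:-1] = 2*u_n[1:-1] - u_nm1[1:-1] + \
              C2*(u_n[:-2] - 2*u_n[1:-1] + u_n[2:])
    return u

x = np.linspace(0, 1, 11)
u_nm1 = np.sin(np.pi*x)            # arbitrary smooth test data
u_n = 0.9*np.sin(np.pi*x)
C2 = 0.8**2
max_diff = np.abs(step_scalar(u_n, u_nm1, C2) -
                  step_vectorized(u_n, u_nm1, C2)).max()
print(max_diff)
```

The two versions perform the same arithmetic operations, so the difference should be zero or at the level of round-off.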


2.4.3 Verification 


We may reuse the quadratic solution $u_e(x,t) = x(L-x)(1+\frac{1}{2}t)$ for verifying
also the vectorized code. A test function can now verify both the scalar and the
vectorized version. Moreover, we may use a user_action function that compares
the computed and exact solution at each time level and performs a test:


def test_quadratic():
    """
    Check the scalar and vectorized versions for
    a quadratic u(x,t)=x(L-x)(1+t/2) that is exactly reproduced.
    """
    # The following function must work for x as array or scalar
    u_exact = lambda x, t: x*(L - x)*(1 + 0.5*t)
    I = lambda x: u_exact(x, 0)
    V = lambda x: 0.5*u_exact(x, 0)
    # f is a scalar (zeros_like(x) works for scalar x too)
    f = lambda x, t: np.zeros_like(x) + 2*c**2*(1 + 0.5*t)

    L = 2.5
    c = 1.5
    C = 0.75
    Nx = 3  # Very coarse mesh for this exact test
    dt = C*(L/Nx)/c
    T = 18

    def assert_no_error(u, x, t, n):
        u_e = u_exact(x, t[n])
        tol = 1E-13
        diff = np.abs(u - u_e).max()
        assert diff < tol

    solver(I, V, f, c, L, dt, C, T,
           user_action=assert_no_error, version='scalar')
    solver(I, V, f, c, L, dt, C, T,
           user_action=assert_no_error, version='vectorized')

Lambda functions

The code segment above demonstrates how to achieve very compact code, without
degraded readability, by use of lambda functions for the various input parameters
that require a Python function. In essence,


f = lambda x, t: L*(x-t)**2


is equivalent to


def f(x, t):
    return L*(x-t)**2


Note that lambda functions can contain only a single expression and no statements.
One advantage with lambda functions is that they can be used directly in calls:


solver(I=lambda x: sin(pi*x/L), V=0, f=0, ...)


2.4.4 Efficiency Measurements 


The wave1D_u0v.py file contains our new solver function with both scalar and
vectorized code. For comparing the efficiency of scalar versus vectorized code, we
need a viz function as discussed in Sect. 2.3.5. All of this viz function can be
reused, except the call to solver_function. This call lacks the parameter version,
which we want to set to vectorized and scalar for our efficiency measurements.

One solution is to copy the viz code from wave1D_u0 into wave1D_u0v.py
and add a version argument to the solver_function call. Taking into account
how much animation code we then duplicate, this is not a good idea. Alternatively,
introducing the version argument in wave1D_u0.viz, so that this function can be
imported into wave1D_u0v.py, is not a good solution either, since version has no
meaning in that file. We need better ideas!


Solution 1 Calling viz in wave1D_u0 with solver_function as our new solver
in wave1D_u0v works fine, since this solver has version='vectorized' as default
value. The problem arises when we want to test version='scalar'. The
simplest solution is then to use wave1D_u0.solver instead. We make a new
viz function in wave1D_u0v.py that has a version argument and that just calls
wave1D_u0.viz:

def viz(
    I, V, f, c, L, dt, C, T,  # PDE parameters
    umin, umax,               # Interval for u in plots
    animate=True,             # Simulation with animation?
    tool='matplotlib',        # 'matplotlib' or 'scitools'
    solver_function=solver,   # Function with numerical algorithm
    version='vectorized',     # 'scalar' or 'vectorized'
    ):
    import wave1D_u0
    if version == 'vectorized':
        # Reuse viz from wave1D_u0, but with the present
        # module's new vectorized solver (which has
        # version='vectorized' as default argument;
        # wave1D_u0.viz does not feature this argument)
        cpu = wave1D_u0.viz(
            I, V, f, c, L, dt, C, T, umin, umax,
            animate, tool, solver_function=solver)
    elif version == 'scalar':
        # Call wave1D_u0.viz with a solver with
        # scalar code and use wave1D_u0.solver.
        cpu = wave1D_u0.viz(
            I, V, f, c, L, dt, C, T, umin, umax,
            animate, tool,
            solver_function=wave1D_u0.solver)


Solution 2 There is a more advanced and fancier solution featuring a very useful 
trick: we can make a new function that will always call wave1D_u0v.solver with 
version=’scalar’. The functools.partial function from standard Python 
takes a function func as argument and a series of positional and keyword arguments 
and returns a new function that will call func with the supplied arguments, while 
the user can control all the other arguments in func. Consider a trivial example, 


def f(a, b, c=2):
    return a + b + c


We want to ensure that f is always called with c=3, i.e., f has only two “free”
arguments a and b. This functionality is obtained by


import functools
f2 = functools.partial(f, c=3)

print(f2(1, 2))  # results in 1+2+3=6


Now f2 calls f with whatever the user supplies as a and b, but c is always 3.
Back to our viz code, we can do


import functools
# Call the present module's solver (wave1D_u0v.solver)
# with version fixed to scalar
scalar_solver = functools.partial(solver, version='scalar')
cpu = wave1D_u0.viz(
    I, V, f, c, L, dt, C, T, umin, umax,
    animate, tool, solver_function=scalar_solver)

The new scalar_solver takes the same arguments as wave1D_u0.solver and
calls wave1D_u0v.solver, but always supplies the extra argument
version='scalar'. When sending this solver_function to wave1D_u0.viz, the latter
will call wave1D_u0v.solver with all the I, V, f, etc., arguments we supply, plus
version='scalar'.
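
The mechanism can be tried out in isolation, without the wave equation programs at hand. In the sketch below, solver and viz are deliberately trivial stand-ins of our own (they only report which version was requested) and must not be confused with the functions in wave1D_u0.py and wave1D_u0v.py.

```python
import functools

def solver(I, V, f, c, L, dt, C, T, user_action=None,
           version='vectorized'):
    """Stand-in for a real solver: only reports the requested version."""
    return version

def viz(I, V, f, c, L, dt, C, T, solver_function=solver):
    """Stand-in for a viz function that knows nothing about version."""
    return solver_function(I, V, f, c, L, dt, C, T, user_action=None)

# New function with the same free arguments, but version locked to 'scalar'
scalar_solver = functools.partial(solver, version='scalar')

default_result = viz(None, 0, 0, 1, 1, 0.1, 1, 1)
scalar_result = viz(None, 0, 0, 1, 1, 0.1, 1, 1,
                    solver_function=scalar_solver)
print(default_result, scalar_result)
```

The point of the design is that viz needs no changes: it calls whatever function it receives with the arguments it already knows, and the extra keyword argument travels along inside the partial object.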


Efficiency experiments We now have a viz function that can call our solver
function both in scalar and vectorized mode. The function
run_efficiency_experiments in wave1D_u0v.py performs a set of experiments and
reports the CPU time spent in the scalar and vectorized solver for the previous
string vibration example with spatial mesh resolutions $N_x = 50, 100, 200, 400, 800$.
Running this function reveals that the vectorized code runs substantially faster:
the vectorized code runs approximately $N_x/10$ times as fast as the scalar code!
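
Readers without access to the full program can get a rough impression of the speed-up from the sketch below, which times one update of the inner points only (not a complete simulation). All names here (advance_scalar, advance_vectorized, time_ratio) and the test data are assumptions of ours, and the measured ratios depend heavily on the machine and the Python and numpy versions.

```python
import timeit
import numpy as np

def advance_scalar(u, u_n, u_nm1, C2):
    for i in range(1, u.size - 1):
        u[i] = 2*u_n[i] - u_nm1[i] + \
               C2*(u_n[i-1] - 2*u_n[i] + u_n[i+1])

def advance_vectorized(u, u_n, u_nm1, C2):
    u[1:-1] = 2*u_n[1:-1] - u_nm1[1:-1] + \
              C2*(u_n[:-2] - 2*u_n[1:-1] + u_n[2:])

def time_ratio(Nx, repetitions=50):
    """Elapsed time of the scalar update divided by the vectorized one."""
    x = np.linspace(0, 1, Nx+1)
    u_n = np.sin(np.pi*x)          # arbitrary test data
    u_nm1 = u_n.copy()
    u = np.zeros(Nx+1)
    t_scalar = timeit.timeit(
        lambda: advance_scalar(u, u_n, u_nm1, 0.64), number=repetitions)
    t_vec = timeit.timeit(
        lambda: advance_vectorized(u, u_n, u_nm1, 0.64), number=repetitions)
    return t_scalar/t_vec

for Nx in (50, 100, 200, 400, 800):
    print('Nx=%3d: scalar/vectorized time ratio %.1f' % (Nx, time_ratio(Nx)))
```

One would expect the ratio to grow with the mesh resolution, since the overhead of the vectorized statement is nearly independent of the array length while the cost of the Python loop is proportional to it.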


2.4.5 Remark on the Updating of Arrays 


At the end of each time step we need to update the u_nm1 and u_n arrays such that
they have the right content for the next time step:

u_nm1[:] = u_n
u_n[:] = u

The order here is important: updating u_n first makes u_nm1 equal to u, which is
wrong!

The assignment u_n[:] = u copies the content of the u array into the elements 
of the u_n array. Such copying takes time, but that time is negligible compared to 
the time needed for computing u from the finite difference formula, even when the 
formula has a vectorized implementation. However, efficiency of program code is a 
key topic when solving PDEs numerically (particularly when there are two or three 
space dimensions), so it must be mentioned that there exists a much more efficient 
way of making the arrays u_nm1 and u_n ready for the next time step. The idea is 
based on switching references and explained as follows. 

A Python variable is actually a reference to some object (C programmers may 
think of pointers). Instead of copying data, we can let u_nm1 refer to the u_n object 
and u_n refer to the u object. This is a very efficient operation (like switching 
pointers in C). A naive implementation like 


u_nm1 = u_n
u_n = u


will fail, however, because now u_nm1 refers to the u_n object, but then the name 
u_n refers to u, so that this u object has two references, u_n and u, while our 
third array, originally referred to by u_nm1, has no more references and is lost. 
This means that the variables u, u_n, and u_nm1 refer to two arrays and not three. 
Consequently, the computations at the next time level will be messed up, since 
updating the elements in u will imply updating the elements in u_n too, thereby
destroying the solution at the previous time step. 

While u_nm1 = u_n is fine, u_n = u is problematic, so the solution to this
problem is to ensure that u points to the u_nm1 array. This is mathematically wrong,
but new correct values will be filled into u at the next time step and make it right.

The correct switch of references is


tmp = u_nm1
u_nm1 = u_n
u_n = u
u = tmp


We can get rid of the temporary reference tmp by writing


u_nm1, u_n, u = u_n, u, u_nm1


This switching of references for updating our arrays will be used in later implemen- 
tations. 


Caution 

The update u_nm1, u_n, u = u_n, u, u_nm1 leaves wrong content in u at
the final time step. This means that if we return u, as we do in the example
codes here, we actually return u_nm1, which is obviously wrong. It is therefore
important to set u = u_n before returning u. (Note that the
user_action function reduces the need to return the solution from the solver.)
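
The difference between the naive and the correct switch of references can be seen directly with three tiny arrays. The array contents (1, 2, 3 for the three time levels) and the helper names a, a_n, a_nm1 are illustrative choices of ours.

```python
import numpy as np

u = np.array([3.0, 3.0, 3.0])      # newest time level
u_n = np.array([2.0, 2.0, 2.0])    # previous time level
u_nm1 = np.array([1.0, 1.0, 1.0])  # time level before that

# Naive switch (on separate names): two names end up on one array
a_nm1, a_n, a = u_nm1, u_n, u
a_nm1 = a_n
a_n = a
naive_aliased = a_n is a           # writing to a now also changes a_n

# Correct switch: rotate the three references, no data is copied
u_nm1, u_n, u = u_n, u, u_nm1
distinct = len({id(u), id(u_n), id(u_nm1)}) == 3

print(naive_aliased, distinct)
print(u_nm1, u_n, u)
```

After the correct switch, u_n holds the newest values and u refers to the oldest array, which will be overwritten at the next time step. This is also why u must be set to u_n before it is returned from a solver.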


2.5 Exercises 


Exercise 2.1: Simulate a standing wave 
The purpose of this exercise is to simulate standing waves on [0, L] and illustrate 
the error in the simulation. Standing waves arise from an initial condition 


$$u(x,0) = A\sin\left(\frac{\pi}{L}mx\right),$$


where m is an integer and A is a freely chosen amplitude. The corresponding exact
solution can be computed and reads


$$u_e(x,t) = A\sin\left(\frac{\pi}{L}mx\right)\cos\left(\frac{\pi}{L}mct\right).$$


a) Explain that for a function $\sin kx\cos\omega t$ the wave length in space is
$\lambda = 2\pi/k$ and the period in time is $P = 2\pi/\omega$. Use these expressions
to find the wave length in space and period in time of $u_e$ above.

b) Import the solver function from wave1D_u0.py into a new file where the viz
function is reimplemented such that it plots either the numerical and the exact
solution, or the error.

c) Make animations where you illustrate how the error
$e_i^n = u_e(x_i,t_n) - u_i^n$ develops and increases in time. Also make animations
of u and $u_e$ simultaneously.

Hint 1 Quite long time simulations are needed in order to display significant
discrepancies between the numerical and exact solution.


Hint 2 A possible set of parameters is L = 12, m = 9, c = 2, A = 1, $N_x = 80$,
C = 0.8. The error mesh function $e^n$ can be simulated for 10 periods, while
20-30 periods are needed to show significant differences between the curves for the
numerical and exact solution.
Filename: wave_standing.


Remarks The important parameters for numerical quality are C and $k\Delta x$, where
$C = c\Delta t/\Delta x$ is the Courant number and k is defined above ($k\Delta x$ is
inversely proportional to the number of mesh points per wave length in space, see
Sect. 2.10.4 for explanation).


Exercise 2.2: Add storage of solution in a user action function 

Extend the plot_u function in the file wave1D_u0.py to also store the solutions 
u in a list. To this end, declare all_u as an empty list in the viz function, out- 
side plot_u, and perform an append operation inside the plot_u function. Note 
that a function, like plot_u, inside another function, like viz, remembers all
local variables in the viz function, including all_u, even when plot_u is called
(as user_action) in the solver function. Test both all_u.append(u) and 
all_u.append(u.copy()). Why does one of these constructions fail to store 
the solution correctly? Let the viz function return the all_u list converted to a 
two-dimensional numpy array. 

Filename: wave1D_u0_s_store. 


Exercise 2.3: Use a class for the user action function 

Redo Exercise 2.2 using a class for the user action function. Let the all_u list be
an attribute in this class and implement the user action function as a method (the
special method __call__ is a natural choice). The class version avoids having the
user action function depend on parameters defined outside the function (such as
all_u in Exercise 2.2).

Filename: wave1D_u0_s2c. 


Exercise 2.4: Compare several Courant numbers in one movie 

The goal of this exercise is to make movies where several curves, correspond- 
ing to different Courant numbers, are visualized. Write a program that resembles 
wave1D_u0_s2c.py in Exercise 2.3, but with a viz function that can take a list of 
C values as argument and create a movie with solutions corresponding to the given 
C values. The plot_u function must be changed to store the solution in an array 
(see Exercise 2.2 or 2.3 for details), solver must be called for each value of
the Courant number, and finally one must run through each time step and plot all 
the spatial solution curves in one figure and store it in a file. 

The challenge in such a visualization is to ensure that the curves in one plot 
correspond to the same time point. The easiest remedy is to keep the time resolution 
constant and change the space resolution to change the Courant number. Note that 
each spatial grid is needed for the final plotting, so it is an option to store those grids 
too. 

Filename: wave_numerics_comparison. 
Exercise 2.5: Implementing the solver function as a generator

The callback function user_action(u, x, t, n) is called from the solver
function (in, e.g., wave1D_u0.py) at every time level and lets the user perform
desired actions with the solution, like plotting it on the screen. We have
implemented the callback function in the typical way it would have been done in C
and Fortran. Specifically, the code looks like


if user_action is not None:
    if user_action(u, x, t, n):
        break


Many Python programmers, however, may claim that solver is an iterative process,
and that iterative processes with callbacks to the user code are more elegantly
implemented as generators. The rest of the text has little meaning unless you are
familiar with Python generators and the yield statement.

Instead of calling user_action, the solver function issues a yield statement,
which is a kind of return statement:


yield u, x, t, n


The program control is directed back to the calling code:


for u, x, t, n in solver(...):
    # Do something with u at t[n]


When the block is done, solver continues with the statement after yield. Note 
that the functionality of terminating the solution process if user_action returns a 
True value is not possible to implement in the generator case. 

Implement the solver function as a generator, and plot the solution at each time 
step. 
Filename: wave1D_u0_generator. 


Project 2.6: Calculus with 1D mesh functions 

This project explores integration and differentiation of mesh functions, both with
scalar and vectorized implementations. We are given a mesh function $f_i$ on a
spatial one-dimensional mesh $x_i = i\Delta x$, $i = 0,\ldots,N_x$, over the interval
[a, b].


a) Define the discrete derivative of $f_i$ by using centered differences at internal
mesh points and one-sided differences at the end points. Implement a scalar
version of the computation in a Python function and write an associated unit test
for the linear case f(x) = 4x - 2.5 where the discrete derivative should be
exact.

b) Vectorize the implementation of the discrete derivative. Extend the unit test to
check the validity of the implementation.

c) To compute the discrete integral $F_i$ of $f_i$, we assume that the mesh function
$f_i$ varies linearly between the mesh points. Let f(x) be such a linear interpolant
of $f_i$. We then have


$$F_i = \int_{x_0}^{x_i} f(x)\,dx.$$


The exact integral of a piecewise linear function f(x) is given by the
Trapezoidal rule. Show that if $F_i$ is already computed, we can find $F_{i+1}$ from


$$F_{i+1} = F_i + \frac{1}{2}(f_i + f_{i+1})\Delta x.$$


Make a function for the scalar implementation of the discrete integral as a mesh
function. That is, the function should return $F_i$ for $i = 0,\ldots,N_x$. For a unit
test one can use the fact that the above defined discrete integral of a linear
function (say f(x) = 4x - 2.5) is exact.

d) Vectorize the implementation of the discrete integral. Extend the unit test to
check the validity of the implementation.


Hint Interpret the recursive formula for $F_{i+1}$ as a sum. Make an array with each
element of the sum and use the "cumsum" (numpy.cumsum) operation to compute
the accumulative sum: numpy.cumsum([1,3,5]) is [1,4,9].


e) Create a class MeshCalculus that can integrate and differentiate mesh func- 
tions. The class can just define some methods that call the previously imple- 
mented Python functions. Here is an example on the usage: 


import numpy as np 
calc = MeshCalculus(vectorized=True) 


x = np.linspace(0, 1, 11) # mesh 

f = np.exp(x) # mesh function 

df = calc.differentiate(f, x) # discrete derivative 

F = calc.integrate(f, x) # discrete anti-derivative 


Filename: mesh_calculus_1D. 
The boundary condition u = 0 in a wave equation reflects the wave, but u changes
sign at the boundary, while the condition $u_x = 0$ reflects the wave as a mirror and
preserves the sign, see a web page⁵ or a movie file⁶ for demonstration.

Our next task is to explain how to implement the boundary condition $u_x = 0$,
which is more complicated to express numerically and also to implement than a
given value of u. We shall present two methods for implementing $u_x = 0$ in a finite
difference scheme, one based on deriving a modified stencil at the boundary, and
another one based on extending the mesh with ghost cells and ghost points.


5 http://tinyurl.com/hbcasmj/book/html/mov-wave/demo_BC_gaussian/index.html
6 http://tinyurl.com/gokgkov/mov-wave/demo_BC_gaussian/movie.flv

2.6.1 Neumann Boundary Condition


When a wave hits a boundary and is to be reflected back, one applies the condition

$$\frac{\partial u}{\partial n} \equiv \boldsymbol{n}\cdot\nabla u = 0. \tag{2.35}$$

The derivative $\partial/\partial n$ is in the outward normal direction from a general
boundary. For a 1D domain [0, L], we have that

$$\left.\frac{\partial}{\partial n}\right|_{x=L} = \left.\frac{\partial}{\partial x}\right|_{x=L},\quad
\left.\frac{\partial}{\partial n}\right|_{x=0} = -\left.\frac{\partial}{\partial x}\right|_{x=0}.$$


Boundary condition terminology

Boundary conditions that specify the value of $\partial u/\partial n$ (or shorter $u_n$)
are known as Neumann⁷ conditions, while Dirichlet conditions⁸ refer to specifications
of u. When the values are zero ($\partial u/\partial n = 0$ or u = 0) we speak about
homogeneous Neumann or Dirichlet conditions.


2.6.2 Discretization of Derivatives at the Boundary 


How can we incorporate the condition (2.35) in the finite difference scheme? Since
we have used central differences in all the other approximations to derivatives in the
scheme, it is tempting to implement (2.35) at x = 0 and $t = t_n$ by the difference

$$[D_{2x}u]_0^n = \frac{u_{-1}^n - u_1^n}{2\Delta x} = 0. \tag{2.36}$$

The problem is that $u_{-1}^n$ is not a u value that is being computed since the point
is outside the mesh. However, if we combine (2.36) with the scheme

$$u_i^{n+1} = -u_i^{n-1} + 2u_i^n + C^2\left(u_{i+1}^n - 2u_i^n + u_{i-1}^n\right), \tag{2.37}$$

for i = 0, we can eliminate the fictitious value $u_{-1}^n$. We see that
$u_{-1}^n = u_1^n$ from (2.36), which can be used in (2.37) to arrive at a modified
scheme for the boundary point $u_0^{n+1}$:

$$u_i^{n+1} = -u_i^{n-1} + 2u_i^n + 2C^2\left(u_{i+1}^n - u_i^n\right),\quad i = 0. \tag{2.38}$$

Figure 2.4 visualizes this equation for computing $u_0^{n+1}$ in terms of $u_0^n$,
$u_0^{n-1}$, and $u_1^n$.
Similarly, (2.35) applied at x = L is discretized by a central difference

$$\frac{u_{N_x+1}^n - u_{N_x-1}^n}{2\Delta x} = 0. \tag{2.39}$$


7 http://en.wikipedia.org/wiki/Neumann_boundary_condition
8 http://en.wikipedia.org/wiki/Dirichlet_conditions

Fig. 2.4 Modified stencil at a boundary with a Neumann condition (plot title: Stencil at boundary point; axes: index i, index n)


Combined with the scheme for $i = N_x$ we get a modified scheme for the boundary
value $u_{N_x}^{n+1}$:

$$u_i^{n+1} = -u_i^{n-1} + 2u_i^n + 2C^2\left(u_{i-1}^n - u_i^n\right),\quad i = N_x. \tag{2.40}$$


The modification of the scheme at the boundary is also required for the special
formula for the first time step. How the stencil moves through the mesh and is
modified at the boundary can be illustrated by an animation in a web page⁹ or a
movie file¹⁰.


2.6.3 Implementation of Neumann Conditions 


We have seen in the preceding section that the special formulas for the boundary
points arise from replacing $u_{i-1}^n$ by $u_{i+1}^n$ when computing $u_i^{n+1}$ from
the stencil formula for i = 0. Similarly, we replace $u_{i+1}^n$ by $u_{i-1}^n$ in the
stencil formula for $i = N_x$. This observation can conveniently be used in the
coding: we just work with the general stencil formula, but write the code such that
it is easy to replace u[i-1] by u[i+1] and vice versa. This is achieved by having the
indices i+1 and i-1 as variables ip1 (i plus 1) and im1 (i minus 1), respectively. At
the boundary we can easily define im1=i+1 while we use im1=i-1 in the internal parts
of the mesh. Here are the details of the implementation (note that the updating
formula for u[i] is the general stencil formula):


9 http://tinyurl.com/hbcasmj/book/html/mov-wave/N_stencil_gpl/index.html
10 http://tinyurl.com/gokgkov/mov-wave/N_stencil_gpl/movie.ogg

i = 0
ip1 = i+1
im1 = ip1  # i-1 -> i+1
u[i] = -u_nm1[i] + 2*u_n[i] + \
       C2*(u_n[im1] - 2*u_n[i] + u_n[ip1])

i = Nx
im1 = i-1
ip1 = im1  # i+1 -> i-1
u[i] = -u_nm1[i] + 2*u_n[i] + \
       C2*(u_n[im1] - 2*u_n[i] + u_n[ip1])


We can in fact create one loop over both the internal and boundary points and
use only one updating formula:


for i in range(0, Nx+1):
    ip1 = i+1 if i < Nx else i-1
    im1 = i-1 if i > 0 else i+1
    u[i] = -u_nm1[i] + 2*u_n[i] + \
           C2*(u_n[im1] - 2*u_n[i] + u_n[ip1])


The program wave1D_n0.py contains a complete implementation of the 1D
wave equation with boundary conditions $u_x = 0$ at x = 0 and x = L.
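
To see the single-loop idea at work without the full program, consider the following compact sketch for the scaled problem (c = L = 1) with f = 0 and V = 0. The function solve_neumann and its argument list are our own simplification and should not be read as the interface of wave1D_n0.py.

```python
import numpy as np

def solve_neumann(I, Nx, Nt, C):
    """Hypothetical helper: u_tt = u_xx on [0,1], u_x = 0 at both ends,
    u(x,0) = I(x), u_t(x,0) = 0. Returns mesh and solution after Nt steps."""
    x = np.linspace(0, 1, Nx+1)
    C2 = C**2
    u_nm1 = np.zeros(Nx+1)
    u_n = I(x)
    u = np.zeros(Nx+1)
    for n in range(Nt):
        for i in range(0, Nx+1):
            ip1 = i+1 if i < Nx else i-1   # mirror index at x = 1
            im1 = i-1 if i > 0 else i+1    # mirror index at x = 0
            diff = u_n[im1] - 2*u_n[i] + u_n[ip1]
            if n == 0:
                u[i] = u_n[i] + 0.5*C2*diff            # first step, V = 0
            else:
                u[i] = -u_nm1[i] + 2*u_n[i] + C2*diff  # general stencil
        u_nm1, u_n, u = u_n, u, u_nm1                  # switch references
    return x, u_n

# Usage: a cosine profile satisfies u_x = 0 at both ends
Nx = 10
x, u_end = solve_neumann(lambda x: np.cos(np.pi*x), Nx, Nt=2*Nx, C=1.0)
max_dev = np.abs(u_end - np.cos(np.pi*x)).max()
print(max_dev)
```

With C = 1 the time step equals the cell size, so 2*Nx steps correspond to the dimensionless time 2, which is one period of this cosine mode. The profile should then be back at its initial shape, and max_dev should be at the level of round-off. A constant initial condition should remain constant for any C.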

It would be nice to modify the test_quadratic test case from the 
wave1D_u0.py with Dirichlet conditions, described in Sect. 2.4.3. However, 
the Neumann conditions require the polynomial variation in the x direction to be of 
third degree, which causes challenging problems when designing a test where the 
numerical solution is known exactly. Exercise 2.15 outlines ideas and code for this 
purpose. The only test in wave1D_n0.py is to start with a plug wave at rest and 
see that the initial condition is reached again perfectly after one period of motion, 
but such a test requires C = 1 (so the numerical solution coincides with the exact 
solution of the PDE, see Sect. 2.10.4). 


2.6.4 Index Set Notation 


To improve our mathematical writing and our implementations, it is wise to
introduce a special notation for index sets. This means that we write $x_i$, followed
by $i \in \mathcal{I}_x$, instead of $i = 0,\ldots,N_x$. Obviously, $\mathcal{I}_x$ must
be the index set $\mathcal{I}_x = \{0,\ldots,N_x\}$, but it is often advantageous to
have a symbol for this set rather than specifying all its elements (all the time, as
we have done up to now). This new notation saves writing and makes specifications of
algorithms and their implementation as computer code simpler.

The first index in the set will be denoted $\mathcal{I}_x^0$ and the last
$\mathcal{I}_x^{-1}$. When we need to skip the first element of the set, we use
$\mathcal{I}_x^+$ for the remaining subset $\mathcal{I}_x^+ = \{1,\ldots,N_x\}$.
Similarly, if the last element is to be dropped, we write
$\mathcal{I}_x^- = \{0,\ldots,N_x-1\}$ for the remaining indices. All the indices
corresponding to inner grid points are specified by
$\mathcal{I}_x^i = \{1,\ldots,N_x-1\}$. For the time domain we find it natural to
explicitly use 0 as the first index, so we will usually write n = 0 and $t_0$ rather
than $n = \mathcal{I}_t^0$. We also avoid notation like $x_{\mathcal{I}_x^{-1}}$ and
will instead use $x_i$, $i = \mathcal{I}_x^{-1}$.

The Python code associated with index sets applies the following conventions:


Notation                  Python

$\mathcal{I}_x$           Ix
$\mathcal{I}_x^0$         Ix[0]
$\mathcal{I}_x^{-1}$      Ix[-1]
$\mathcal{I}_x^-$         Ix[:-1]
$\mathcal{I}_x^+$         Ix[1:]
$\mathcal{I}_x^i$         Ix[1:-1]


Why index sets are useful

An important feature of the index set notation is that it keeps our formulas and
code independent of how we count mesh points. For example, the notation
$i \in \mathcal{I}_x$ or $i = \mathcal{I}_x^0$ remains the same whether $\mathcal{I}_x$
is defined as above or as starting at 1, i.e., $\mathcal{I}_x = \{1,\ldots,Q\}$.
Similarly, we can in the code define Ix=range(Nx+1) or Ix=range(1,Q+1), and
expressions like Ix[0] and Ix[1:-1] remain correct.
One application where the index set notation is convenient is conversion of code
from a language where arrays have base index 0 (e.g., Python and C) to languages
where the base index is 1 (e.g., MATLAB and Fortran). Another important
application is implementation of Neumann conditions via ghost points (see next
section).
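
The independence of the counting convention can be demonstrated with two index sets, one starting at 0 and one starting at 1. The values Nx = 4 and Q = 5 and the name Jx for the one-based set are arbitrary choices of ours for this illustration.

```python
Nx = 4
Ix = range(0, Nx+1)          # zero-based index set {0,...,Nx}

Q = 5
Jx = range(1, Q+1)           # one-based index set {1,...,Q}

# The same expressions pick out first, last, and inner indices
for name, S in (('Ix', Ix), ('Jx', Jx)):
    print(name, 'first:', S[0], 'last:', S[-1], 'inner:', list(S[1:-1]))
```

Code written in terms of Ix[0], Ix[-1], and Ix[1:-1] therefore needs no changes if the definition of the index set is replaced.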


For the current problem setting in the x,t plane, we work with the index sets

$$\mathcal{I}_x = \{0,\ldots,N_x\},\quad \mathcal{I}_t = \{0,\ldots,N_t\}, \tag{2.41}$$

defined in Python as


Ix = range(0, Nx+1)
It = range(0, Nt+1)

A finite difference scheme can with the index set notation be specified as

$$u_i^{n+1} = u_i^n + \frac{1}{2}C^2\left(u_{i+1}^n - 2u_i^n + u_{i-1}^n\right),\quad i\in\mathcal{I}_x^i,\ n = 0,$$
$$u_i^{n+1} = -u_i^{n-1} + 2u_i^n + C^2\left(u_{i+1}^n - 2u_i^n + u_{i-1}^n\right),\quad i\in\mathcal{I}_x^i,\ n\in\mathcal{I}_t^i,$$
$$u_i^{n+1} = 0,\quad i = \mathcal{I}_x^0,\ n\in\mathcal{I}_t^-,$$
$$u_i^{n+1} = 0,\quad i = \mathcal{I}_x^{-1},\ n\in\mathcal{I}_t^-.$$

The corresponding implementation becomes


# Special formula for the first step (n=0)
for i in Ix[1:-1]:
    u[i] = u_n[i] + 0.5*C2*(u_n[i-1] - 2*u_n[i] + u_n[i+1])

# Time loop
for n in It[1:-1]:
    # Compute internal points
    for i in Ix[1:-1]:
        u[i] = - u_nm1[i] + 2*u_n[i] + \
               C2*(u_n[i-1] - 2*u_n[i] + u_n[i+1])
    # Compute boundary conditions
    i = Ix[0];  u[i] = 0
    i = Ix[-1]; u[i] = 0

Notice

The program wave1D_dn.py applies the index set notation and solves the 1D
wave equation $u_{tt} = c^2u_{xx} + f(x,t)$ with quite general boundary and initial
conditions:


x = 0: $u = U_0(t)$ or $u_x = 0$
x = L: $u = U_L(t)$ or $u_x = 0$
t = 0: u = I(x)
t = 0: $u_t = V(x)$


The program combines Dirichlet and Neumann conditions, scalar and vectorized
implementation of schemes, and the index set notation into one piece of code. A
lot of test examples are also included in the program:


• A rectangular plug-shaped initial condition. (For C = 1 the solution will be
  a rectangle that jumps one cell per time step, making the case well suited for
  verification.)
• A Gaussian function as initial condition.
• A triangular profile as initial condition, which resembles the typical initial
  shape of a guitar string.
• A sinusoidal variation of u at x = 0 and either u = 0 or $u_x = 0$ at x = L.
• An analytical solution $u(x,t) = \cos(m\pi t/L)\sin(\frac{1}{2}m\pi x/L)$, which can
  be used for convergence rate tests.


2.6.5 Verifying the Implementation of Neumann Conditions 


How can we test that the Neumann conditions are correctly implemented? The 
solver function in the wave1D_dn.py program described in the box above ac- 
cepts Dirichlet or Neumann conditions at x = 0 and x = L. It is tempting to 
apply a quadratic solution as described in Sect. 2.2.1 and 2.3.3, but it turns out that 
this solution is no longer an exact solution of the discrete equations if a Neumann 
condition is implemented on the boundary. A linear solution does not help since 
we only have homogeneous Neumann conditions in wave1D_dn.py, and we are 
consequently left with testing just a constant solution: u = const. 
def test_constant():
    """
    Check the scalar and vectorized versions for
    a constant u(x,t). We simulate in [0, L] and apply
    Neumann and Dirichlet conditions at both ends.
    """
    u_const = 0.45
    u_exact = lambda x, t: u_const
    I = lambda x: u_exact(x, 0)
    V = lambda x: 0
    f = lambda x, t: 0

    def assert_no_error(u, x, t, n):
        u_e = u_exact(x, t[n])
        diff = np.abs(u - u_e).max()
        msg = 'diff=%E, t_%d=%g' % (diff, n, t[n])
        tol = 1E-13
        assert diff < tol, msg

    for U_0 in (None, lambda t: u_const):
        for U_L in (None, lambda t: u_const):
            L = 2.5
            c = 1.5
            C = 0.75
            Nx = 3   # Very coarse mesh for this exact test
            dt = C*(L/Nx)/c
            T = 18   # long time integration

            solver(I, V, f, c, U_0, U_L, L, dt, C, T,
                   user_action=assert_no_error,
                   version='scalar')
            solver(I, V, f, c, U_0, U_L, L, dt, C, T,
                   user_action=assert_no_error,
                   version='vectorized')
            print(U_0, U_L)


The quadratic solution is very useful for testing, but it requires Dirichlet conditions 
at both ends. 

Another test may utilize the fact that the approximation error vanishes when the 
Courant number is unity. We can, for example, start with a plug profile as initial 
condition, let this wave split into two plug waves, one in each direction, and check 
that the two plug waves come back and form the initial condition again after “one 
period” of the solution process. Neumann conditions can be applied at both ends. 
A proper test function reads 


def test_plug():
    """Check that an initial plug is correct back after one period."""
    L = 1.0
    c = 0.5
    dt = (L/10)/c  # Nx=10
    I = lambda x: 0 if abs(x-L/2.0) > 0.1 else 1

    u_s, x, t, cpu = solver(
        I=I,
        V=None, f=None, c=0.5, U_0=None, U_L=None, L=L,
        dt=dt, C=1, T=4, user_action=None, version='scalar')
    u_v, x, t, cpu = solver(
        I=I,
        V=None, f=None, c=0.5, U_0=None, U_L=None, L=L,
        dt=dt, C=1, T=4, user_action=None, version='vectorized')
    tol = 1E-13
    diff = abs(u_s - u_v).max()
    assert diff < tol
    u_0 = np.array([I(x_) for x_ in x])
    diff = np.abs(u_s - u_0).max()
    assert diff < tol


Other tests must rely on an unknown approximation error, so effectively we are 
left with tests on the convergence rate. 
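The plug test can be tried out without the full wave1D_dn.py program. The following is a minimal sketch, assuming zero initial velocity, no source term, and homogeneous Neumann conditions at both ends implemented by mirroring the index at the boundary. The function solve_neumann and its signature are hypothetical and much simpler than the solver function used above.

```python
import numpy as np

def solve_neumann(I, c, L, Nx, C, T):
    """Solve u_tt = c^2 u_xx with u_x = 0 at both ends and u_t(x,0) = 0."""
    x = np.linspace(0, L, Nx+1)
    dx = x[1] - x[0]
    dt = C*dx/c
    Nt = int(round(T/dt))
    C2 = C**2
    Ix = range(0, Nx+1)

    u_nm1 = np.zeros(Nx+1)
    u_n = np.array([I(x_) for x_ in x], dtype=float)
    u = np.zeros(Nx+1)

    def laplace(v, i):
        # Mirror the index at a boundary point: this imposes u_x = 0
        im1 = i-1 if i > Ix[0] else i+1
        ip1 = i+1 if i < Ix[-1] else i-1
        return v[im1] - 2*v[i] + v[ip1]

    # Special formula for the first step (zero initial velocity)
    for i in Ix:
        u[i] = u_n[i] + 0.5*C2*laplace(u_n, i)
    u_nm1, u_n, u = u_n, u, u_nm1

    # After the loop, u_n holds the solution at time level Nt
    for n in range(1, Nt):
        for i in Ix:
            u[i] = -u_nm1[i] + 2*u_n[i] + C2*laplace(u_n, i)
        u_nm1, u_n, u = u_n, u, u_nm1
    return u_n, x

L = 1.0; c = 0.5; Nx = 10
I = lambda x: 0 if abs(x - L/2.0) > 0.1 else 1
period = 2*L/c                      # time for the two plugs to return
u_end, x = solve_neumann(I, c, L, Nx, C=1, T=period)
u_start = np.array([I(x_) for x_ in x], dtype=float)
print(np.abs(u_end - u_start).max())
```

With $C = 1$ the discrete solution is $u_i^n = \frac{1}{2}(\tilde I_{i+n} + \tilde I_{i-n})$, where $\tilde I$ is the even, $2N_x$-periodic extension of the initial values, so after $2N_x$ steps the initial condition is recovered.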


2.6.6 Alternative Implementation via Ghost Cells 


Idea Instead of modifying the scheme at the boundary, we can introduce extra
points outside the domain such that the fictitious values $u_{-1}^n$ and $u_{N_x+1}^n$ are
defined in the mesh. Adding the intervals $[-\Delta x, 0]$ and $[L, L+\Delta x]$, known as
ghost cells, to the mesh gives us all the needed mesh points, corresponding to
$i = -1, 0, \ldots, N_x, N_x+1$. The extra points with $i = -1$ and $i = N_x+1$
are known as ghost points, and values at these points, $u_{-1}^n$ and $u_{N_x+1}^n$, are called
ghost values.
The important idea is to ensure that we always have

\[ u_{-1}^n = u_1^n \quad\text{and}\quad u_{N_x+1}^n = u_{N_x-1}^n, \]

because then the application of the standard scheme at a boundary point $i = 0$
or $i = N_x$ will be correct and guarantee that the solution is compatible with the
boundary condition $u_x = 0$.

Some readers may find it strange to just extend the domain with ghost cells
as a general technique, because in some problems there is a completely different
medium with different physics and equations right outside of a boundary. Nevertheless,
one should view the ghost cell technique as a purely mathematical technique,
which is valid in the limit $\Delta x \to 0$ and helps us to implement derivatives.


Implementation The u array now needs extra elements corresponding to the ghost 
points. Two new point values are needed: 


u = zeros(Nx+3) 


The arrays u_n and u_nm1 must be defined accordingly. 
Unfortunately, a major indexing problem arises with ghost cells. The reason is
that Python indices must start at 0 and u[-1] will always mean the last element
in u. This fact gives, apparently, a mismatch between the mathematical indices
$i = -1, 0, \ldots, N_x+1$ and the Python indices running over u: 0,..,Nx+2. One
remedy is to change the mathematical indexing of $i$ in the scheme and write

\[ u_i^{n+1} = \cdots, \quad i = 1,\ldots,N_x+1, \]

instead of $i = 0,\ldots,N_x$ as we have previously used. The ghost points now
correspond to $i = 0$ and $i = N_x+2$. A better solution is to use the ideas of Sect. 2.6.4:
we hide the specific index value in an index set and operate with inner and boundary
points using the index set notation.

To this end, we define u with proper length and Ix to be the corresponding
indices for the real physical mesh points ($1,2,\ldots,N_x+1$):


u = zeros(Nx+3)
Ix = range(1, u.shape[0]-1)

That is, the boundary points have indices Ix[0] and Ix[-1] (as before). We first
update the solution at all physical mesh points (i.e., interior points in the mesh):

for i in Ix:
    u[i] = - u_nm1[i] + 2*u_n[i] + \
           C2*(u_n[i-1] - 2*u_n[i] + u_n[i+1])


The indexing becomes a bit more complicated when we call functions like V(x) 
and f (x, t), as we must remember that the appropriate x coordinate is given as 
x[i-Ix[0]]: 


for i in Ix:
    u[i] = u_n[i] + dt*V(x[i-Ix[0]]) + \
           0.5*C2*(u_n[i-1] - 2*u_n[i] + u_n[i+1]) + \
           0.5*dt2*f(x[i-Ix[0]], t[0])


It remains to update the solution at ghost points, i.e., u[0] and u[-1] (or
u[Nx+2]). For a boundary condition $u_x = 0$, the ghost value must equal the value
at the associated inner mesh point. Computer code makes this statement precise:

i = Ix[0]          # x=0 boundary
u[i-1] = u[i+1]
i = Ix[-1]         # x=L boundary
u[i+1] = u[i-1]

The physical solution to be plotted is now in u[1:-1], or equivalently
u[Ix[0]:Ix[-1]+1], so this slice is the quantity to be returned from a solver function.
A complete implementation appears in the program wave1D_n0_ghost.py.
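The pieces above can be assembled into a small runnable sketch. It is not the wave1D_n0_ghost.py program: the function solve_ghost is a hypothetical, stripped-down solver that assumes zero initial velocity, no source term, and $u_x = 0$ at both ends. The check uses the standing wave $u = \cos(\pi x/L)\cos(\pi ct/L)$, which fulfills both Neumann conditions and is reproduced to round-off level when $C = 1$.

```python
import numpy as np

def solve_ghost(I, c, L, Nx, C, T):
    """Homogeneous Neumann conditions at both ends via ghost cells."""
    x = np.linspace(0, L, Nx+1)        # physical mesh points only
    dx = x[1] - x[0]
    dt = C*dx/c
    Nt = int(round(T/dt))
    C2 = C**2

    u = np.zeros(Nx+3)                 # two extra ghost values
    u_n = np.zeros(Nx+3)
    u_nm1 = np.zeros(Nx+3)
    Ix = range(1, u.shape[0]-1)        # indices of physical points

    def set_ghost_values(v):
        i = Ix[0];  v[i-1] = v[i+1]    # x=0 boundary
        i = Ix[-1]; v[i+1] = v[i-1]    # x=L boundary

    for i in Ix:
        u_n[i] = I(x[i-Ix[0]])         # x[i-Ix[0]] belongs to u_n[i]
    set_ghost_values(u_n)

    # First step, assuming zero initial velocity and no source term
    for i in Ix:
        u[i] = u_n[i] + 0.5*C2*(u_n[i-1] - 2*u_n[i] + u_n[i+1])
    set_ghost_values(u)
    u_nm1, u_n, u = u_n, u, u_nm1

    for n in range(1, Nt):
        for i in Ix:
            u[i] = -u_nm1[i] + 2*u_n[i] + \
                   C2*(u_n[i-1] - 2*u_n[i] + u_n[i+1])
        set_ghost_values(u)
        u_nm1, u_n, u = u_n, u, u_nm1
    return u_n[Ix[0]:Ix[-1]+1], x      # strip the ghost values

L = 1.0; c = 1.0; T = 0.4
I = lambda x: np.cos(np.pi*x/L)
u_num, x = solve_ghost(I, c, L, Nx=20, C=1, T=T)
u_exact = np.cos(np.pi*x/L)*np.cos(np.pi*c*T/L)
print(np.abs(u_num - u_exact).max())
```

Note that the same stencil is used at all points in Ix; the boundary condition enters only through set_ghost_values.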
Warning
We have to be careful with how the spatial and temporal mesh points are stored.
Say we let x be the physical mesh points,

x = linspace(0, L, Nx+1)

“Standard coding” of the initial condition,

for i in Ix:
    u_n[i] = I(x[i])

becomes wrong, since u_n and x have different lengths and the index i
corresponds to two different mesh points. In fact, x[i] corresponds to u[1+i].
A correct implementation is

for i in Ix:
    u_n[i] = I(x[i-Ix[0]])

Similarly, a source term usually coded as f(x[i], t[n]) is incorrect if x is
defined to be the physical points, so x[i] must be replaced by x[i-Ix[0]].

An alternative remedy is to let x also cover the ghost points such that u[i] is
the value at x[i].


The ghost cell is only added to the boundary where we have a Neumann condition.
Suppose we have a Dirichlet condition at $x = L$ and a homogeneous Neumann
condition at $x = 0$. One ghost cell $[-\Delta x, 0]$ is added to the mesh, so the index set
for the physical points becomes $\{1,\ldots,N_x+1\}$. A relevant implementation is

u = zeros(Nx+2)
Ix = range(1, u.shape[0])
...
for i in Ix[:-1]:
    u[i] = - u_nm1[i] + 2*u_n[i] + \
           C2*(u_n[i-1] - 2*u_n[i] + u_n[i+1]) + \
           dt2*f(x[i-Ix[0]], t[n])
i = Ix[-1]
u[i] = U_L       # set Dirichlet value at x=L
i = Ix[0]
u[i-1] = u[i+1]  # update ghost value

The physical solution to be plotted is now in u[1:] or (as always)
u[Ix[0]:Ix[-1]+1].
2.7 Generalization: Variable Wave Velocity


Our next generalization of the 1D wave equation (2.1) or (2.17) is to allow for
a variable wave velocity $c$: $c = c(x)$, usually motivated by wave motion in a
domain composed of different physical media. When the media differ in physical
properties like density or porosity, the wave velocity $c$ is affected and will depend
on the position in space. Figure 2.5 shows a wave propagating in one medium
$[0, 0.7] \cup [0.9, 1]$ with wave velocity $c_1$ (left) before it enters a second medium
$(0.7, 0.9)$ with wave velocity $c_2$ (right). When the wave meets the boundary where
$c$ jumps from $c_1$ to $c_2$, a part of the wave is reflected back into the first medium
(the reflected wave), while one part is transmitted through the second medium (the
transmitted wave).


2.7.1 The Model PDE with a Variable Coefficient 


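Instead of working with the squared quantity $c^2(x)$, we shall for notational
convenience introduce $q(x) = c^2(x)$. A 1D wave equation with variable wave velocity
often takes the form

\[ \frac{\partial^2 u}{\partial t^2} = \frac{\partial}{\partial x}\left(q(x)\frac{\partial u}{\partial x}\right) + f(x,t). \tag{2.42} \]

This is the most frequent form of a wave equation with variable wave velocity, but
other forms also appear, see Sect. 2.14.1 and equation (2.125).
As usual, we sample (2.42) at a mesh point,

\[ \frac{\partial^2}{\partial t^2}u(x_i,t_n) = \frac{\partial}{\partial x}\left(q(x_i)\frac{\partial}{\partial x}u(x_i,t_n)\right) + f(x_i,t_n), \]

where the only new term to discretize is

\[ \frac{\partial}{\partial x}\left(q(x_i)\frac{\partial}{\partial x}u(x_i,t_n)\right) = \left[\frac{\partial}{\partial x}\left(q(x)\frac{\partial u}{\partial x}\right)\right]_i^n. \]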

[Figure: two plot panels titled "Nx=80, t=0.375000" (left) and "Nx=80, t=1.250000" (right)]
Fig. 2.5 Left: wave entering another medium; right: transmitted and reflected wave
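2.7.2 Discretizing the Variable Coefficient

The principal idea is to first discretize the outer derivative. Define

\[ \phi = q(x)\frac{\partial u}{\partial x}, \]

and use a centered derivative around $x = x_i$ for the derivative of $\phi$:

\[ \left[\frac{\partial\phi}{\partial x}\right]_i^n \approx \frac{\phi_{i+\frac{1}{2}} - \phi_{i-\frac{1}{2}}}{\Delta x} = [D_x\phi]_i^n. \]

Then discretize

\[ \phi_{i+\frac{1}{2}} = q_{i+\frac{1}{2}}\left[\frac{\partial u}{\partial x}\right]_{i+\frac{1}{2}}^n \approx q_{i+\frac{1}{2}}\frac{u_{i+1}^n - u_i^n}{\Delta x} = [qD_xu]_{i+\frac{1}{2}}^n. \]

Similarly,

\[ \phi_{i-\frac{1}{2}} = q_{i-\frac{1}{2}}\left[\frac{\partial u}{\partial x}\right]_{i-\frac{1}{2}}^n \approx q_{i-\frac{1}{2}}\frac{u_i^n - u_{i-1}^n}{\Delta x} = [qD_xu]_{i-\frac{1}{2}}^n. \]

These intermediate results are now combined to

\[ \left[\frac{\partial}{\partial x}\left(q(x)\frac{\partial u}{\partial x}\right)\right]_i^n \approx \frac{1}{\Delta x^2}\left(q_{i+\frac{1}{2}}\left(u_{i+1}^n - u_i^n\right) - q_{i-\frac{1}{2}}\left(u_i^n - u_{i-1}^n\right)\right). \tag{2.43} \]

With operator notation we can write the discretization as

\[ \left[\frac{\partial}{\partial x}\left(q(x)\frac{\partial u}{\partial x}\right)\right]_i^n \approx [D_x(qD_xu)]_i^n. \tag{2.44} \]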


Do not use the chain rule on the spatial derivative term!
Many are tempted to use the chain rule on the term $\frac{\partial}{\partial x}\left(q(x)\frac{\partial u}{\partial x}\right)$, but this is not a
good idea when discretizing such a term.
The term with a variable coefficient expresses the net flux $qu_x$ into a small
volume (i.e., interval in 1D):

\[ \frac{\partial}{\partial x}\left(q(x)\frac{\partial u}{\partial x}\right) \approx \frac{1}{\Delta x}\left(q(x+\Delta x)u_x(x+\Delta x) - q(x)u_x(x)\right). \]

Our discretization reflects this principle directly: $qu_x$ at the right end of the
cell minus $qu_x$ at the left end, because this follows from the formula (2.43) or
$[D_x(qD_xu)]_i^n$.
When using the chain rule, we get two terms $qu_{xx} + q_xu_x$. The typical
discretization is

\[ [qD_xD_xu + D_{2x}q\,D_{2x}u]_i^n. \tag{2.45} \]

Writing this out shows that it is different from $[D_x(qD_xu)]_i^n$ and lacks the
physical interpretation of net flux into a cell. With a smooth and slowly varying $q(x)$
the differences between the two discretizations are not substantial. However,
when $q$ exhibits (potentially large) jumps, $[D_x(qD_xu)]_i^n$ with harmonic
averaging of $q$ yields a better solution than arithmetic averaging or (2.45). In the
literature, the discretization $[D_x(qD_xu)]_i^n$ totally dominates and very few
mention the alternative in (2.45).
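The net flux interpretation can be checked numerically. The sketch below is only an illustration: the function names, the mesh, and the choice of $q$ with a jump at $x = 0.5$ are assumptions made here, and the chain rule variant is one possible centered discretization of $qu_{xx} + q_xu_x$. Summing the flux form over the interior points makes all inner fluxes cancel, so only the fluxes at the two ends remain; the chain rule form has no such property.

```python
import numpy as np

def flux_form(q, u, dx):
    """[D_x(q D_x u)] at interior points, arithmetic mean for q."""
    q_right = 0.5*(q[1:-1] + q[2:])       # q at i+1/2
    q_left = 0.5*(q[1:-1] + q[:-2])       # q at i-1/2
    return (q_right*(u[2:] - u[1:-1]) -
            q_left*(u[1:-1] - u[:-2]))/dx**2

def chain_rule_form(q, u, dx):
    """Centered differences applied to q*u_xx + q_x*u_x at interior points."""
    u_xx = (u[2:] - 2*u[1:-1] + u[:-2])/dx**2
    u_x = (u[2:] - u[:-2])/(2*dx)
    q_x = (q[2:] - q[:-2])/(2*dx)
    return q[1:-1]*u_xx + q_x*u_x

x = np.linspace(0, 1, 11)
dx = x[1] - x[0]
u = np.sin(np.pi*x)
q = np.where(x < 0.5, 1.0, 10.0)          # jump in q

a = flux_form(q, u, dx)
b = chain_rule_form(q, u, dx)

# Flux at the right end minus flux at the left end
end_fluxes = (0.5*(q[-2] + q[-1])*(u[-1] - u[-2]) -
              0.5*(q[0] + q[1])*(u[1] - u[0]))/dx
print(abs(a.sum()*dx - end_fluxes))       # round-off level
print(abs(b.sum()*dx - end_fluxes))       # clearly above round-off
```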


2.7.3 Computing the Coefficient Between Mesh Points 


If $q$ is a known function of $x$, we can easily evaluate $q_{i+\frac{1}{2}}$ simply as $q(x_{i+\frac{1}{2}})$ with
$x_{i+\frac{1}{2}} = x_i + \frac{1}{2}\Delta x$. However, in many cases $c$, and hence $q$, is only known as
a discrete function, often at the mesh points $x_i$. Evaluating $q$ between two mesh
points $x_i$ and $x_{i+1}$ must then be done by interpolation techniques, of which three
are of particular interest in this context:

\begin{align}
q_{i+\frac{1}{2}} &\approx \frac{1}{2}\left(q_i + q_{i+1}\right) = [\overline{q}^{x}]_{i+\frac{1}{2}} & \text{(arithmetic mean)} \tag{2.46}\\
q_{i+\frac{1}{2}} &\approx 2\left(\frac{1}{q_i} + \frac{1}{q_{i+1}}\right)^{-1} & \text{(harmonic mean)} \tag{2.47}\\
q_{i+\frac{1}{2}} &\approx \left(q_iq_{i+1}\right)^{1/2} & \text{(geometric mean)} \tag{2.48}
\end{align}

The arithmetic mean in (2.46) is by far the most commonly used averaging
technique and is well suited for smooth $q(x)$ functions. The harmonic mean is often
preferred when $q(x)$ exhibits large jumps (which is typical for geological media).
The geometric mean is less used, but popular in discretizations to linearize quadratic
nonlinearities (see Sect. 1.10.2 for an example).
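The three means are easily computed for all cell midpoints at once when q is a NumPy array of values at the mesh points. The function names and the sample array below are illustrative assumptions; the formulas are (2.46)-(2.48).

```python
import numpy as np

def arithmetic_mean(q):
    """q at i+1/2 by (2.46), for i = 0,...,len(q)-2."""
    return 0.5*(q[:-1] + q[1:])

def harmonic_mean(q):
    """q at i+1/2 by (2.47); requires q != 0."""
    return 2.0/(1.0/q[:-1] + 1.0/q[1:])

def geometric_mean(q):
    """q at i+1/2 by (2.48); requires q >= 0."""
    return np.sqrt(q[:-1]*q[1:])

q = np.array([1.0, 1.0, 4.0, 4.0])   # jump between the two middle points
q_a = arithmetic_mean(q)
q_h = harmonic_mean(q)
q_g = geometric_mean(q)
print(q_a, q_h, q_g)
```

Across the jump from 1 to 4 the harmonic mean gives the smallest value and the arithmetic mean the largest, with the geometric mean in between; away from the jump all three coincide.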

With the operator notation from (2.46) we can specify the discretization of the
complete variable-coefficient wave equation in a compact way:

\[ [D_tD_tu = D_x\overline{q}^{x}D_xu + f]_i^n. \tag{2.49} \]

Strictly speaking, $[D_x\overline{q}^{x}D_xu]_i^n = [D_x(\overline{q}^{x}D_xu)]_i^n$.

From the compact difference notation we immediately see what kind of difference
each term is approximated with. The notation $\overline{q}^{x}$ also specifies that the
variable coefficient is approximated by an arithmetic mean, the definition being
$[\overline{q}^{x}]_{i+\frac{1}{2}} = (q_i + q_{i+1})/2$.

Before implementing, it remains to solve (2.49) with respect to $u_i^{n+1}$:

\begin{align}
u_i^{n+1} = {} & -u_i^{n-1} + 2u_i^n \nonumber\\
& + \left(\frac{\Delta t}{\Delta x}\right)^2\left(\frac{1}{2}(q_i + q_{i+1})(u_{i+1}^n - u_i^n) - \frac{1}{2}(q_i + q_{i-1})(u_i^n - u_{i-1}^n)\right) \nonumber\\
& + \Delta t^2 f_i^n. \tag{2.50}
\end{align}
2.7.4 How a Variable Coefficient Affects the Stability


The stability criterion derived later (Sect. 2.10.3) reads $\Delta t \leq \Delta x/c$. If $c = c(x)$,
the criterion will depend on the spatial location. We must therefore choose a $\Delta t$
that is small enough such that no mesh cell has $\Delta t > \Delta x/c(x)$. That is, we must
use the largest $c$ value in the criterion:

\[ \Delta t \leq \beta\frac{\Delta x}{\max_{x\in[0,L]} c(x)}. \tag{2.51} \]

The parameter $\beta$ is included as a safety factor: in some problems with a significantly
varying $c$ it turns out that one must choose $\beta < 1$ to have stable solutions ($\beta = 0.9$
may act as an all-round value).

A different strategy to handle the stability criterion with variable wave velocity
is to use a spatially varying $\Delta t$. While the idea is mathematically attractive at
first sight, the implementation quickly becomes very complicated, so we stick to a
constant $\Delta t$ and a worst case value of $c(x)$ (with a safety factor $\beta$).
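A minimal sketch of how (2.51) may be applied in code follows. The function name stable_time_step is hypothetical, and the maximum of $c(x)$ is here taken over the mesh points only, which is an assumption that may miss a peak of $c$ located between mesh points.

```python
import numpy as np

def stable_time_step(c, x, beta=0.9):
    """Return dt = beta*dx/max(c) on a uniform mesh x, cf. (2.51)."""
    dx = x[1] - x[0]
    c_max = max(c(x_) for x_ in x)     # worst case over the mesh points
    return beta*dx/c_max

x = np.linspace(0, 1, 101)
# Wave velocity reduced by a factor 2 in the medium [0.7, 0.9]
c = lambda x: 0.5 if 0.7 <= x <= 0.9 else 1.0
dt = stable_time_step(c, x)
print(dt)
```

The slow medium does not restrict the time step here; it is the largest velocity in the domain that decides.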


2.7.5 Neumann Condition and a Variable Coefficient 


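Consider a Neumann condition $\partial u/\partial x = 0$ at $x = L = N_x\Delta x$, discretized as

\[ [D_{2x}u]_i^n = \frac{u_{i+1}^n - u_{i-1}^n}{2\Delta x} = 0 \quad\Rightarrow\quad u_{i+1}^n = u_{i-1}^n, \]

for $i = N_x$. Using the scheme (2.50) at the end point $i = N_x$ with $u_{i+1}^n = u_{i-1}^n$
results in

\begin{align}
u_i^{n+1} &= -u_i^{n-1} + 2u_i^n + \left(\frac{\Delta t}{\Delta x}\right)^2\left(q_{i+\frac{1}{2}}(u_{i-1}^n - u_i^n) - q_{i-\frac{1}{2}}(u_i^n - u_{i-1}^n)\right) + \Delta t^2 f_i^n \tag{2.52}\\
&= -u_i^{n-1} + 2u_i^n + \left(\frac{\Delta t}{\Delta x}\right)^2\left(q_{i+\frac{1}{2}} + q_{i-\frac{1}{2}}\right)(u_{i-1}^n - u_i^n) + \Delta t^2 f_i^n \tag{2.53}\\
&\approx -u_i^{n-1} + 2u_i^n + \left(\frac{\Delta t}{\Delta x}\right)^2 2q_i(u_{i-1}^n - u_i^n) + \Delta t^2 f_i^n. \tag{2.54}
\end{align}

Here we used the approximation

\[ q_{i+\frac{1}{2}} + q_{i-\frac{1}{2}} = 2q_i + \left(\frac{d^2q}{dx^2}\right)_i\frac{\Delta x^2}{4} + O(\Delta x^4) \approx 2q_i. \tag{2.55} \]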
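An alternative derivation may apply the arithmetic mean for $q_{i-\frac{1}{2}}$ and $q_{i+\frac{1}{2}}$ in
(2.53), leading to the term

\[ \left(q_i + \frac{1}{2}(q_{i+1} + q_{i-1})\right)(u_{i-1}^n - u_i^n). \]

Since $\frac{1}{2}(q_{i+1} + q_{i-1}) = q_i + O(\Delta x^2)$, we can approximate with $2q_i(u_{i-1}^n - u_i^n)$ for
$i = N_x$ and get the same term as we did above.

A common technique when implementing $\partial u/\partial x = 0$ boundary conditions is
to assume $dq/dx = 0$ as well. This implies $q_{i+1} = q_{i-1}$ and $q_{i+1/2} = q_{i-1/2}$ for
$i = N_x$. The implications for the scheme are

\begin{align}
u_i^{n+1} &= -u_i^{n-1} + 2u_i^n + \left(\frac{\Delta t}{\Delta x}\right)^2\left(q_{i+\frac{1}{2}}(u_{i-1}^n - u_i^n) - q_{i-\frac{1}{2}}(u_i^n - u_{i-1}^n)\right) + \Delta t^2 f_i^n \tag{2.56}\\
&= -u_i^{n-1} + 2u_i^n + \left(\frac{\Delta t}{\Delta x}\right)^2 2q_{i-\frac{1}{2}}(u_{i-1}^n - u_i^n) + \Delta t^2 f_i^n. \tag{2.57}
\end{align}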

2.7.6 Implementation of Variable Coefficients 


The implementation of the scheme with a variable wave velocity $q(x) = c^2(x)$ may
assume that $q$ is available as an array q[i] at the spatial mesh points. The following
loop is a straightforward implementation of the scheme (2.50):

for i in range(1, Nx):
    u[i] = - u_nm1[i] + 2*u_n[i] + \
           C2*(0.5*(q[i] + q[i+1])*(u_n[i+1] - u_n[i]) - \
               0.5*(q[i] + q[i-1])*(u_n[i] - u_n[i-1])) + \
           dt2*f(x[i], t[n])

The coefficient C2 is now defined as (dt/dx)**2, i.e., not as the squared Courant
number, since the wave velocity is variable and appears inside the parenthesis.

With Neumann conditions $u_x = 0$ at the boundary, we need to combine this
scheme with the discrete version of the boundary condition, as shown in Sect. 2.7.5.
Nevertheless, it would be convenient to reuse the formula for the interior points and
just modify the indices ip1=i+1 and im1=i-1 as we did in Sect. 2.6.3. Assuming
$dq/dx = 0$ at the boundaries, we can implement the scheme at the boundary with
the following code.

i = 0
ip1 = i+1
im1 = ip1
u[i] = - u_nm1[i] + 2*u_n[i] + \
       C2*(0.5*(q[i] + q[ip1])*(u_n[ip1] - u_n[i]) - \
           0.5*(q[i] + q[im1])*(u_n[i] - u_n[im1])) + \
       dt2*f(x[i], t[n])
With ghost cells we can just reuse the formula for the interior points also at the
boundary, provided that the ghost values of both $u$ and $q$ are correctly updated to
ensure $u_x = 0$ and $q_x = 0$.

A vectorized version of the scheme with a variable coefficient at internal mesh
points becomes

u[1:-1] = - u_nm1[1:-1] + 2*u_n[1:-1] + \
          C2*(0.5*(q[1:-1] + q[2:])*(u_n[2:] - u_n[1:-1]) -
              0.5*(q[1:-1] + q[:-2])*(u_n[1:-1] - u_n[:-2])) + \
          dt2*f(x[1:-1], t[n])
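A simple way of gaining confidence in the vectorized expression is to compare it with the scalar loop for one time step. The sketch below assumes $f = 0$ and the conditions $u_x = 0$ and $q_x = 0$ at both ends; the function names and the test data are illustrative choices and are not taken from wave1D_dn_vc.py.

```python
import numpy as np

def step_scalar(u_n, u_nm1, q, C2):
    """One time step of (2.50) with f=0 and u_x=q_x=0 at both ends."""
    Nx = len(u_n) - 1
    u = np.zeros(Nx+1)
    for i in range(0, Nx+1):
        ip1 = i+1 if i < Nx else i-1   # mirror index at x=L
        im1 = i-1 if i > 0 else i+1    # mirror index at x=0
        u[i] = - u_nm1[i] + 2*u_n[i] + \
               C2*(0.5*(q[i] + q[ip1])*(u_n[ip1] - u_n[i]) -
                   0.5*(q[i] + q[im1])*(u_n[i] - u_n[im1]))
    return u

def step_vectorized(u_n, u_nm1, q, C2):
    """Same step: slices for interior points, mirrored formula at the ends."""
    u = np.zeros_like(u_n)
    u[1:-1] = - u_nm1[1:-1] + 2*u_n[1:-1] + \
              C2*(0.5*(q[1:-1] + q[2:])*(u_n[2:] - u_n[1:-1]) -
                  0.5*(q[1:-1] + q[:-2])*(u_n[1:-1] - u_n[:-2]))
    # Boundary point i with inner neighbor j (ip1 = im1 = j)
    for i, j in [(0, 1), (-1, -2)]:
        u[i] = - u_nm1[i] + 2*u_n[i] + \
               C2*(q[i] + q[j])*(u_n[j] - u_n[i])
    return u

x = np.linspace(0, 1, 21)
dx = x[1] - x[0]
dt = 0.5*dx/2.0                     # below dx/max(c), where max(c) = 2
C2 = (dt/dx)**2                     # not the squared Courant number
q = (1 + x)**2                      # q = c^2 with c(x) = 1 + x
u_n = np.exp(-50*(x - 0.5)**2)
u_nm1 = 0.9*u_n
diff = np.abs(step_scalar(u_n, u_nm1, q, C2) -
              step_vectorized(u_n, u_nm1, q, C2)).max()
print(diff)
```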


2.7.7 A More General PDE Model with Variable Coefficients 


Sometimes a wave PDE has a variable coefficient in front of the time-derivative
term:

\[ \varrho(x)\frac{\partial^2 u}{\partial t^2} = \frac{\partial}{\partial x}\left(q(x)\frac{\partial u}{\partial x}\right) + f(x,t). \tag{2.58} \]

One example appears when modeling elastic waves in a rod with varying density,
cf. Sect. 2.14.1 with $\varrho(x)$.
A natural scheme for (2.58) is

\[ [\varrho D_tD_tu = D_x\overline{q}^{x}D_xu + f]_i^n. \tag{2.59} \]

We realize that the $\varrho$ coefficient poses no particular difficulty, since $\varrho$ enters the
formula just as a simple factor in front of a derivative. There is hence no need for
any averaging of $\varrho$. Often, $\varrho$ will be moved to the right-hand side, also without any
difficulty:

\[ [D_tD_tu = \varrho^{-1}(D_x\overline{q}^{x}D_xu + f)]_i^n. \tag{2.60} \]


2.7.8 Generalization: Damping 


Waves die out by two mechanisms. In 2D and 3D the energy of the wave spreads 
out in space, and energy conservation then requires the amplitude to decrease. This 
effect is not present in 1D. Damping is another cause of amplitude reduction. For 
example, the vibrations of a string die out because of damping due to air resistance 
and non-elastic effects in the string. 

The simplest way of including damping is to add a first-order derivative to the
equation (in the same way as friction forces enter a vibrating mechanical system):

\[ \frac{\partial^2 u}{\partial t^2} + b\frac{\partial u}{\partial t} = c^2\frac{\partial^2 u}{\partial x^2} + f(x,t), \tag{2.61} \]

where $b \geq 0$ is a prescribed damping coefficient.
A typical discretization of (2.61) in terms of centered differences reads

\[ [D_tD_tu + bD_{2t}u = c^2D_xD_xu + f]_i^n. \tag{2.62} \]

Writing out the equation and solving for the unknown $u_i^{n+1}$ gives the scheme

\[ u_i^{n+1} = \left(1 + \tfrac{1}{2}b\Delta t\right)^{-1}\left(\left(\tfrac{1}{2}b\Delta t - 1\right)u_i^{n-1} + 2u_i^n + C^2\left(u_{i+1}^n - 2u_i^n + u_{i-1}^n\right) + \Delta t^2 f_i^n\right), \tag{2.63} \]

for $i \in \mathcal{I}_x^i$ and $n \geq 1$. New equations must be derived for $u_i^1$, and for boundary
points in case of Neumann conditions.

The damping is very small in many wave phenomena and thus only evident for 
very long time simulations. This makes the standard wave equation without damp- 
ing relevant for a lot of applications. 
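The scheme (2.63) is easily coded for the interior points. The following sketch uses a hypothetical function damped_step and arbitrary test data; it verifies that the computed values fulfill the discrete equation (2.62) and that $b = 0$ brings back the standard undamped update.

```python
import numpy as np

def damped_step(u_n, u_nm1, C2, b, dt, f_n=0.0):
    """Interior-point update (2.63); f_n is f at the interior points."""
    u = np.zeros_like(u_n)
    B = 0.5*b*dt
    u[1:-1] = ((B - 1)*u_nm1[1:-1] + 2*u_n[1:-1] +
               C2*(u_n[2:] - 2*u_n[1:-1] + u_n[:-2]) +
               dt**2*f_n)/(1 + B)
    return u

c = 1.0; b = 2.0
x = np.linspace(0, 1, 11)
dx = x[1] - x[0]; dt = 0.05
C2 = (c*dt/dx)**2
u_n = np.sin(np.pi*x)
u_nm1 = 0.9*u_n
u = damped_step(u_n, u_nm1, C2, b, dt)

# Residual of (2.62) with f=0 at the interior points
i = slice(1, -1)
residual = (u[i] - 2*u_n[i] + u_nm1[i])/dt**2 + \
           b*(u[i] - u_nm1[i])/(2*dt) - \
           c**2*(u_n[2:] - 2*u_n[1:-1] + u_n[:-2])/dx**2
print(np.abs(residual).max())

# b=0 must give the standard scheme without damping
u_b0 = damped_step(u_n, u_nm1, C2, 0.0, dt)
u_std = -u_nm1[i] + 2*u_n[i] + C2*(u_n[2:] - 2*u_n[1:-1] + u_n[:-2])
print(np.abs(u_b0[i] - u_std).max())
```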


2.8 Building a General 1D Wave Equation Solver 


The program wave1D_dn_vc.py is a fairly general code for 1D wave propagation 
problems that targets the following initial-boundary value problem 


\begin{align}
u_{tt} &= (c^2(x)u_x)_x + f(x,t), & x\in(0,L),\ t\in(0,T] \tag{2.64}\\
u(x,0) &= I(x), & x\in[0,L] \tag{2.65}\\
u_t(x,0) &= V(x), & x\in[0,L] \tag{2.66}\\
u(0,t) &= U_0(t) \ \text{ or } \ u_x(0,t) = 0, & t\in(0,T] \tag{2.67}\\
u(L,t) &= U_L(t) \ \text{ or } \ u_x(L,t) = 0, & t\in(0,T]. \tag{2.68}
\end{align}


The only new feature here is the time-dependent Dirichlet conditions, but they
are trivial to implement:

i = Ix[0]   # x=0
u[i] = U_0(t[n+1])

i = Ix[-1]  # x=L
u[i] = U_L(t[n+1])

The solver function is a natural extension of the simplest solver function in
the initial wave1D_u0.py program, extended with Neumann boundary conditions
($u_x = 0$), time-varying Dirichlet conditions, as well as a variable wave velocity.
The different code segments needed to make these extensions have been shown and
commented upon in the preceding text. We refer to the solver function in the
wave1D_dn_vc.py file for all the details. Note in that solver function, however,
that the technique of “hashing” is used to check whether a certain simulation has
been run before, or not. This technique is further explained in Sect. C.2.3.

The vectorization is only applied inside the time loop, not for the initial condition 
or the first time steps, since this initial work is negligible for long time simulations 
in 1D problems. 

The following sections explain various more advanced programming techniques 
applied in the general 1D wave equation solver. 
2.8.1 User Action Function as a Class


A useful feature in the wave1D_dn_vc.py program is the specification of the
user_action function as a class. This part of the program may need some motivation
and explanation. Although the plot_u_st function (and the PlotMatplotlib
class) in the wave1D_u0.viz function remembers the local variables in the viz
function, it is a cleaner solution to store the needed variables together with the
function, which is exactly what a class offers.


The code A class for flexible plotting, cleaning up files, and making movie files, like
the function wave1D_u0.viz did, can be coded as follows:

class PlotAndStoreSolution:
    """
    Class for the user_action function in solver.
    Visualizes the solution only.
    """
    def __init__(
        self,
        casename='tmp',    # Prefix in filenames
        umin=-1, umax=1,   # Fixed range of y axis
        pause_between_frames=None,  # Movie speed
        backend='matplotlib',       # or 'gnuplot' or None
        screen_movie=True, # Show movie on screen?
        title='',          # Extra message in title
        skip_frame=1,      # Skip every skip_frame frame
        filename=None):    # Name of file with solutions
        self.casename = casename
        self.yaxis = [umin, umax]
        self.pause = pause_between_frames
        self.backend = backend
        if backend is None:
            # Use native matplotlib
            import matplotlib.pyplot as plt
        elif backend in ('matplotlib', 'gnuplot'):
            module = 'scitools.easyviz.' + backend + '_'
            # import_module binds the module to a local name
            # (exec of an import statement does not in Python 3)
            import importlib
            plt = importlib.import_module(module)
        self.plt = plt
        self.screen_movie = screen_movie
        self.title = title
        self.skip_frame = skip_frame
        self.filename = filename
        if filename is not None:
            # Store time points when u is written to file
            self.t = []
            filenames = glob.glob('.' + self.filename + '*.dat.npz')
            for filename in filenames:
                os.remove(filename)

        # Clean up old movie frames
        for filename in glob.glob('frame_*.png'):
            os.remove(filename)

    def __call__(self, u, x, t, n):
        """
        Callback function user_action, called by solver:
        Store solution, plot on screen and save to file.
        """
        # Save solution u to a file using numpy.savez
        if self.filename is not None:
            name = 'u%04d' % n  # array name
            kwargs = {name: u}
            fname = '.' + self.filename + '_' + name + '.dat'
            np.savez(fname, **kwargs)
            self.t.append(t[n])  # store corresponding time value
            if n == 0:           # save x once
                np.savez('.' + self.filename + '_x.dat', x=x)

        # Animate
        if n % self.skip_frame != 0:
            return
        title = 't=%.3f' % t[n]
        if self.title:
            title = self.title + ' ' + title
        if self.backend is None:
            # native matplotlib animation
            if n == 0:
                self.plt.ion()
                self.lines = self.plt.plot(x, u, 'r-')
                self.plt.axis([x[0], x[-1],
                               self.yaxis[0], self.yaxis[1]])
                self.plt.xlabel('x')
                self.plt.ylabel('u')
                self.plt.title(title)
                self.plt.legend(['t=%.3f' % t[n]])
            else:
                # Update new solution
                self.lines[0].set_ydata(u)
                self.plt.legend(['t=%.3f' % t[n]])
                self.plt.draw()
        else:
            # scitools.easyviz animation
            self.plt.plot(x, u, 'r-',
                          xlabel='x', ylabel='u',
                          axis=[x[0], x[-1],
                                self.yaxis[0], self.yaxis[1]],
                          title=title,
                          show=self.screen_movie)
        # pause
        if t[n] == 0:
            time.sleep(2)  # let initial condition stay 2 s
        else:
            if self.pause is None:
                pause = 0.2 if u.size < 100 else 0
            else:
                pause = self.pause
            time.sleep(pause)

        self.plt.savefig('frame_%04d.png' % (n))
Dissection Understanding this class requires quite some familiarity with Python
in general and class programming in particular. The class supports plotting with
Matplotlib (backend=None) or SciTools (backend='matplotlib' or
backend='gnuplot') for maximum flexibility.

The constructor shows how we can flexibly import the plotting engine as (typically)
scitools.easyviz.gnuplot_ or scitools.easyviz.matplotlib_
(note the trailing underscore; it is required). With the screen_movie parameter
we can suppress displaying each movie frame on the screen. Alternatively, for slow
movies associated with fine meshes, one can set skip_frame=10, causing only every
10th frame to be shown.

The __call__ method makes PlotAndStoreSolution instances behave like 
functions, so we can just pass an instance, say p, as the user_action argument in 
the solver function, and any call to user_action will be a call to p.__call__. 
The __call__ method plots the solution on the screen, saves the plot to file, and 
stores the solution in a file for later retrieval. 

More details on storing the solution in files appear in Sect. C.2. 
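The mechanism behind __call__ can be seen in isolation with a much smaller class. In the sketch below, both StoreMaxAmplitude and fake_solver are hypothetical stand-ins invented for the illustration; the only feature shared with the real code is the user_action signature (u, x, t, n).

```python
class StoreMaxAmplitude:
    """Callable object with the user_action signature (u, x, t, n)."""
    def __init__(self):
        self.history = []          # state kept between the calls

    def __call__(self, u, x, t, n):
        self.history.append((t[n], max(abs(v) for v in u)))

def fake_solver(user_action):
    """Stand-in for a solver: calls user_action at each time level."""
    x = [0.0, 0.5, 1.0]
    t = [0.0, 0.1, 0.2]
    for n in range(len(t)):
        u = [0.0, 1.0 - n*0.25, 0.0]
        user_action(u, x, t, n)

p = StoreMaxAmplitude()
fake_solver(user_action=p)     # p(u, x, t, n) means p.__call__(u, x, t, n)
print(p.history)
```

The instance p carries its own data (here the list history) from call to call, which is the reason for preferring a class to a plain function with global variables.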


2.8.2 Pulse Propagation in Two Media 


The function pulse in wave1D_dn_vc.py demonstrates wave motion in heterogeneous
media where $c$ varies. One can specify an interval where the wave velocity
is decreased by a factor slowness_factor (or increased by making this factor less
than one). Figure 2.5 shows a typical simulation scenario.

Four types of initial conditions are available:

1. a rectangular pulse (plug),
2. a Gaussian function (gaussian),
3. a “cosine hat” consisting of one period of the cosine function (cosinehat),
4. half a period of a “cosine hat” (half-cosinehat)

These peak-shaped initial conditions can be placed in the middle (loc='center')
or at the left end (loc='left') of the domain. With the pulse in the middle, it splits
in two parts, each with half the initial amplitude, traveling in opposite directions.
With the pulse at the left end, centered at $x = 0$, and using the symmetry condition
$\partial u/\partial x = 0$, only a right-going pulse is generated. There is also a left-going pulse,
but it travels from $x = 0$ in negative $x$ direction and is not visible in the domain
$[0, L]$.

The pulse function is a flexible tool for playing around with various wave 
shapes and jumps in the wave velocity (i.e., discontinuous media). The code is 
shown to demonstrate how easy it is to reach this flexibility with the building blocks 
we have already developed: 
def pulse(
    C=1,            # Maximum Courant number
    Nx=200,         # spatial resolution
    animate=True,
    version='vectorized',
    T=2,            # end time
    loc='left',     # location of initial condition
    pulse_tp='gaussian',  # pulse/init.cond. type
    slowness_factor=2, # inverse of wave vel. in right medium
    medium=[0.7, 0.9], # interval for right medium
    skip_frame=1,      # skip frames in animations
    sigma=0.05         # width measure of the pulse
    ):
    """
    Various peaked-shaped initial conditions on [0,1].
    Wave velocity is decreased by the slowness_factor inside
    medium. The loc parameter can be 'center' or 'left',
    depending on where the initial pulse is to be located.
    The sigma parameter governs the width of the pulse.
    """
    # Use scaled parameters: L=1 for domain length, c_0=1
    # for wave velocity outside the domain.
    L = 1.0
    c_0 = 1.0
    if loc == 'center':
        xc = L/2
    elif loc == 'left':
        xc = 0

    if pulse_tp in ('gaussian','Gaussian'):
        def I(x):
            return np.exp(-0.5*((x-xc)/sigma)**2)
    elif pulse_tp == 'plug':
        def I(x):
            return 0 if abs(x-xc) > sigma else 1
    elif pulse_tp == 'cosinehat':
        def I(x):
            # One period of a cosine
            w = 2
            a = w*sigma
            return 0.5*(1 + np.cos(np.pi*(x-xc)/a)) \
                   if xc - a <= x <= xc + a else 0

    elif pulse_tp == 'half-cosinehat':
        def I(x):
            # Half a period of a cosine
            w = 4
            a = w*sigma
            return np.cos(np.pi*(x-xc)/a) \
                   if xc - 0.5*a <= x <= xc + 0.5*a else 0
    else:
        raise ValueError('Wrong pulse_tp="%s"' % pulse_tp)

    def c(x):
        return c_0/slowness_factor \
               if medium[0] <= x <= medium[1] else c_0

    umin=-0.5; umax=1.5*I(xc)
    casename = '%s_Nx%s_sf%s' % \
               (pulse_tp, Nx, slowness_factor)
    action = PlotMediumAndSolution(
        medium, casename=casename, umin=umin, umax=umax,
        skip_frame=skip_frame, screen_movie=animate,
        backend=None, filename='tmpdata')

    # Choose the stability limit with given Nx, worst case c
    # (lower C will then use this dt, but smaller Nx)
    dt = (L/Nx)/c_0
    cpu, hashed_input = solver(
        I=I, V=None, f=None, c=c,
        U_0=None, U_L=None,
        L=L, dt=dt, C=C, T=T,
        user_action=action,
        version=version,
        stability_safety_factor=1)

    if cpu > 0:  # did we generate new data?
        action.close_file(hashed_input)
        action.make_movie_file()
    print('cpu (-1 means no new data generated):', cpu)


def convergence_rates(
    u_exact,
    I, V, f, c, U_0, U_L, L,
    dt0, num_meshes,
    C, T, version='scalar',
    stability_safety_factor=1.0):
    """
    Halve the time step and estimate convergence rates
    for num_meshes simulations.
    """
    class ComputeError:
        def __init__(self, norm_type):
            self.error = 0

        def __call__(self, u, x, t, n):
            """Store norm of the error in self.error."""
            error = np.abs(u - u_exact(x, t[n])).max()
            self.error = max(self.error, error)

    E = []
    h = []  # dt, solver adjusts dx such that C=dt*c/dx
    dt = dt0
    for i in range(num_meshes):
        error_calculator = ComputeError('Linf')
        solver(I, V, f, c, U_0, U_L, L, dt, C, T,
               user_action=error_calculator,
               version='scalar',
               stability_safety_factor=1.0)
        E.append(error_calculator.error)
        h.append(dt)
        dt /= 2  # halve the time step for next simulation
    print('E:', E)
    print('h:', h)
    r = [np.log(E[i]/E[i-1])/np.log(h[i]/h[i-1])
         for i in range(1,num_meshes)]
    return r
def test_convrate_sincos():
    n = m = 2
    L = 1.0
    u_exact = lambda x, t: np.cos(m*np.pi/L*t)*np.sin(m*np.pi/L*x)

    r = convergence_rates(
        u_exact=u_exact,
        I=lambda x: u_exact(x, 0),
        # The next seven arguments are missing in this listing;
        # they are inferred from u_exact (dt0 is an assumed value)
        V=lambda x: 0,
        f=lambda x, t: 0,
        c=1,
        U_0=lambda t: 0,
        U_L=lambda t: 0,
        L=L,
        dt0=0.1,
        num_meshes=6,
        C=0.9,
        T=1,
        version='scalar',
        stability_safety_factor=1.0)
    print('rates sin(x)*cos(t) solution:',
          [round(r_,2) for r_ in r])
    assert abs(r[-1] - 2) < 0.002


The PlotMediumAndSolution class used here is a subclass of PlotAndStoreSolution
where the medium with reduced $c$ value, as specified by the medium
interval, is visualized in the plots.
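The rate formula used at the end of convergence_rates, $r_i = \ln(E_i/E_{i-1})/\ln(h_i/h_{i-1})$, can be tried on synthetic data without running any simulation. In the sketch below, the helper rates and the error model $E = 0.3h^2 + 0.5h^3$ are assumptions made for the illustration only.

```python
import math

def rates(E, h):
    """r_i = ln(E_i/E_{i-1}) / ln(h_i/h_{i-1}) for i = 1,...,len(E)-1."""
    return [math.log(E[i]/E[i-1])/math.log(h[i]/h[i-1])
            for i in range(1, len(E))]

h = [0.1/2**k for k in range(5)]            # repeatedly halved step
E = [0.3*h_**2 + 0.5*h_**3 for h_ in h]     # synthetic errors
r = rates(E, h)
print([round(r_, 2) for r_ in r])
```

The computed rates approach 2 from above as $h$ decreases, since the $h^3$ term loses influence. This is why a test should compare only the last rate, obtained on the finest meshes, with the expected value.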


Comment on the choices of discretization parameters 

The argument $N_x$ in the pulse function does not correspond to the actual spatial
resolution if $C < 1$, since the solver function takes a fixed $\Delta t$ and $C$, and
adjusts $\Delta x$ accordingly. As seen in the pulse function, the specified $\Delta t$ is
chosen according to the limit $C = 1$, so if $C < 1$, $\Delta t$ remains the same, but the
solver function operates with a larger $\Delta x$ and smaller $N_x$ than was specified
in the call to pulse. The practical reason is that we always want to keep $\Delta t$
fixed such that plot frames and movies are synchronized in time regardless of
the value of $C$ (i.e., $\Delta x$ is varied when the Courant number varies).
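The effect can be quantified with a few lines of code. The sketch assumes that the solver computes $\Delta x = \Delta t\, c_{\max}/C$ and rounds $L/\Delta x$ to the nearest integer; this is an assumption about the solver function, and the helper actual_resolution is hypothetical.

```python
def actual_resolution(Nx, C, L=1.0, c_max=1.0):
    """Mesh actually used when dt is fixed by the C=1 limit for Nx."""
    dt = (L/Nx)/c_max          # as in pulse: stability limit for C=1
    dx = dt*c_max/C            # assumed adjustment made by the solver
    return int(round(L/dx)), dx, dt

results = {C: actual_resolution(200, C) for C in (1, 0.75, 0.5)}
for C in results:
    print(C, results[C])
```

With Nx=200 in the call, the number of cells actually used drops to 150 for $C = 0.75$ and to 100 for $C = 0.5$, while $\Delta t$ stays the same.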


The reader is encouraged to play around with the pulse function: 


>>> import waveiD_dn_vc as w 
>>> w.pulse(Nx=50, loc=’left’, pulse_tp=’cosinehat’, slowness_factor=2) 


To easily kill the graphics by Ctrl-C and restart a new simulation it might be easier 
to run the above two statements from the command line with 


Terminal> python -c 'import wave1D_dn_vc as w; w.pulse(...)'


2.9 Exercises


Exercise 2.7: Find the analytical solution to a damped wave equation 

Consider the wave equation with damping (2.61). The goal is to find an exact
solution to a wave problem with damping and zero source term. A starting point
is the standing wave solution from Exercise 2.1. It becomes necessary to include a
damping term $e^{-\beta t}$ and also have both a sine and cosine component in time:

$$u_e(x,t) = e^{-\beta t}\sin kx\,(A\cos\omega t + B\sin\omega t) .$$

Find $k$ from the boundary conditions $u(0,t) = u(L,t) = 0$. Then use the PDE
to find constraints on $\beta$, $\omega$, $A$, and $B$. Set up a complete
initial-boundary value problem and its solution.

Filename: damped_waves. 


Problem 2.8: Explore symmetry boundary conditions 
Consider the simple "plug" wave where $\Omega = [-L, L]$ and

$$I(x) = \begin{cases} 1, & x \in [-\delta, \delta],\\ 0, & \text{otherwise} \end{cases}$$

for some number $0 < \delta < L$. The other initial condition is $u_t(x,0) = 0$
and there is no source term $f$. The boundary conditions can be set to $u = 0$.
The solution to this problem is symmetric around $x = 0$. This means that we can
simulate the wave process in only half of the domain $[0, L]$.


a) Argue why the symmetry boundary condition is $u_x = 0$ at $x = 0$.
Hint Symmetry of a function about $x = x_0$ means that $f(x_0 + h) = f(x_0 - h)$.


b) Perform simulations of the complete wave problem on $[-L, L]$. Thereafter,
utilize the symmetry of the solution and run a simulation in half of the domain
$[0, L]$, using a boundary condition at $x = 0$. Compare plots from the two
solutions and confirm that they are the same.

c) Prove the symmetry property of the solution by setting up the complete
initial-boundary value problem and showing that if $u(x,t)$ is a solution, then
also $u(-x,t)$ is a solution.

d) If the code works correctly, the solution $u(x,t) = x(L - x)(1 + \frac{t}{2})$
should be reproduced exactly. Write a test function test_quadratic that checks
whether this is the case. Simulate for $x$ in $[0, \frac{L}{2}]$ with a symmetry
condition at the end $x = \frac{L}{2}$.


Filename: wave1D_symmetric.


Exercise 2.9: Send pulse waves through a layered medium 
Use the pulse function in wave1D_dn_vc.py to investigate sending a pulse,
located with its peak at $x = 0$, through two media with different wave
velocities. The (scaled) velocity in the left medium is 1 while it is $1/s_f$
in the right medium. Report what happens with a Gaussian pulse, a “cosine hat”
pulse, half a “cosine hat” pulse, and a plug pulse for resolutions
$N_x = 40, 80, 160$, and $s_f = 2, 4$. Simulate until $T = 2$.

Filename: pulse1D. 


Exercise 2.10: Explain why numerical noise occurs 

The experiments performed in Exercise 2.9 show considerable numerical noise in
the form of non-physical waves, especially for $s_f = 4$ and the plug pulse or
the half “cosine hat” pulse. The noise is much less visible for a Gaussian
pulse. Run the case with the plug and half “cosine hat” pulse for $s_f = 1$,
$C = 0.9, 0.25$, and $N_x = 40, 80, 160$. Use the numerical dispersion relation
to explain the observations.

Filename: pulse1D_analysis. 


Exercise 2.11: Investigate harmonic averaging in a 1D model 

Harmonic means are often used if the wave velocity is non-smooth or discontinuous. 
Will harmonic averaging of the wave velocity give less numerical noise for the case 
$s_f = 4$ in Exercise 2.9?

Filename: pulse1D_harmonic. 


Problem 2.12: Implement open boundary conditions 

To enable a wave to leave the computational domain and travel undisturbed through
the boundary $x = L$, one can in a one-dimensional problem impose the following
condition, called a radiation condition or open boundary condition:

$$\frac{\partial u}{\partial t} + c\frac{\partial u}{\partial x} = 0 . \tag{2.69}$$

The parameter $c$ is the wave velocity.
Show that (2.69) accepts a solution $u = g_R(x - ct)$ (right-going wave), but not
$u = g_L(x + ct)$ (left-going wave). This means that (2.69) will allow any
right-going wave $g_R(x - ct)$ to pass through the boundary undisturbed.
A corresponding open boundary condition for a left-going wave through $x = 0$
is

$$\frac{\partial u}{\partial t} - c\frac{\partial u}{\partial x} = 0 . \tag{2.70}$$


a) A natural idea for discretizing the condition (2.69) at the spatial end point
$i = N_x$ is to apply centered differences in time and space:

$$[D_{2t}u + cD_{2x}u = 0]^n_i,\quad i = N_x . \tag{2.71}$$

Eliminate the fictitious value $u^n_{N_x+1}$ by using the discrete equation at
the same point.

The equation for the first step, $u^1_i$, is in principle also affected, but we
can then use the condition $u_{N_x} = 0$ since the wave has not yet reached the
right boundary.
b) A much more convenient implementation of the open boundary condition at
$x = L$ can be based on an explicit discretization

$$[D^+_tu + cD^-_xu = 0]^n_i,\quad i = N_x . \tag{2.72}$$

From this equation, one can solve for $u^{n+1}_{N_x}$ and apply the formula as a
Dirichlet condition at the boundary point. However, the finite difference
approximations involved are of first order.
Implement this scheme for a wave equation $u_{tt} = c^2u_{xx}$ in a domain
$[0, L]$, where you have $u_x = 0$ at $x = 0$, the condition (2.69) at $x = L$,
and an initial disturbance in the middle of the domain, e.g., a plug profile like

$$u(x,0) = \begin{cases} 1, & L/2 - \ell \le x \le L/2 + \ell,\\ 0, & \text{otherwise.} \end{cases}$$

Observe that the initial wave is split in two, the left-going wave is reflected
at $x = 0$, and both waves travel out of $x = L$, leaving the solution as $u = 0$
in $[0, L]$. Use a unit Courant number such that the numerical solution is exact.
Make a movie to illustrate what happens.
Because this simplified implementation of the open boundary condition works,
there is no need to pursue the more complicated discretization in a).


Hint Modify the solver function in wave1D_dn.py.


c) Add the possibility to have either $u_x = 0$ or an open boundary condition at
the left boundary. The latter condition is discretized as

$$[D^+_tu - cD^+_xu = 0]^n_i,\quad i = 0, \tag{2.73}$$

leading to an explicit update of the boundary value $u^{n+1}_0$.
The implementation can be tested with a Gaussian function as initial condition:

$$g(x; m, s) = \frac{1}{\sqrt{2\pi}s}e^{-\frac{(x-m)^2}{2s^2}} .$$

Run two tests:

(a) Disturbance in the middle of the domain, $I(x) = g(x; L/2, s)$, and open
boundary condition at the left end.

(b) Disturbance at the left end, $I(x) = g(x; 0, s)$, and $u_x = 0$ as symmetry
boundary condition at this end.

Make test functions for both cases, testing that the solution is zero after the
waves have left the domain.

d) In 2D and 3D it is difficult to compute the correct wave velocity normal to
the boundary, which is needed in generalizations of the open boundary conditions
in higher dimensions. Test the effect of having a slightly wrong wave velocity
in (2.72). Make movies to illustrate what happens.


Filename: wave1D_open_BC.
Remarks The condition (2.69) works perfectly in 1D when $c$ is known. In 2D and
3D, however, the condition reads $u_t + c_xu_x + c_yu_y = 0$, where $c_x$ and
$c_y$ are the wave speeds in the $x$ and $y$ directions. Estimating these
components (i.e., the direction of the wave) is often challenging. Other methods
are normally used in 2D and 3D to let waves move out of a computational domain.
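
Solving the explicit discretization (2.72) for the new boundary value gives $u^{n+1}_{N_x} = u^n_{N_x} - C(u^n_{N_x} - u^n_{N_x-1})$ with $C = c\Delta t/\Delta x$. The sketch below isolates just this update and checks it on a right-going wave with unit Courant number. The function name open_bc_right and the Gaussian pulse shape are assumptions made for the illustration; they are not taken from wave1D_dn.py.

```python
import numpy as np

def open_bc_right(u_n, C):
    """Boundary value u^{n+1}_{Nx} from [D_t^+ u + c D_x^- u = 0]^n_{Nx}."""
    return u_n[-1] - C*(u_n[-1] - u_n[-2])

# Check the update on a right-going pulse with unit Courant number
c = 1.0; L = 1.0; Nx = 100
x = np.linspace(0, L, Nx + 1)
dx = x[1] - x[0]
C = 1.0
dt = C*dx/c
g_R = lambda xi: np.exp(-0.5*((xi - 0.8)/0.05)**2)  # hypothetical pulse shape
t_n = 0.1
u_n = g_R(x - c*t_n)                   # exact right-going wave at time t_n
u_boundary = open_bc_right(u_n, C)     # explicit update at x = L
u_exact = g_R(L - c*(t_n + dt))        # exact boundary value at time t_n + dt
print('update: %.12f  exact: %.12f' % (u_boundary, u_exact))
```

For $C = 1$ the update reduces to copying the neighboring value, $u^{n+1}_{N_x} = u^n_{N_x-1}$, which is why a right-going wave passes through the boundary without error in this special case.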


Exercise 2.13: Implement periodic boundary conditions 

It is frequently of interest to follow wave motion over large distances and long times. 
A straightforward approach is to work with a very large domain, but that might lead 
to a lot of computations in areas of the domain where the waves cannot be noticed. 
A more efficient approach is to let a right-going wave out of the domain and at the 
same time let it enter the domain on the left. This is called a periodic boundary 
condition. 

The boundary condition at the right end $x = L$ is an open boundary condition
(see Problem 2.12) to let a right-going wave out of the domain. At the left end,
$x = 0$, we apply, in the beginning of the simulation, either a symmetry boundary
condition (see Problem 2.8) $u_x = 0$, or an open boundary condition.

This initial wave will split in two and either be reflected or transported out of
the domain at $x = 0$. The purpose of the exercise is to follow the right-going
wave. We can do that with a periodic boundary condition. This means that when the
right-going wave hits the boundary $x = L$, the open boundary condition lets the
wave out of the domain, but at the same time we use a boundary condition on the
left end $x = 0$ that feeds the outgoing wave into the domain again. This periodic
condition is simply $u(0) = u(L)$. The switch from $u_x = 0$ or an open boundary
condition at the left end to a periodic condition can happen when
$u(L,t) > \epsilon$, where $\epsilon = 10^{-4}$ might be an appropriate value for
determining when the right-going wave hits the boundary $x = L$.

The open boundary conditions can conveniently be discretized as explained in
Problem 2.12. Implement the described type of boundary conditions and test them
on two different initial shapes: a plug $u(x,0) = 1$ for $x \le 0.1$,
$u(x,0) = 0$ for $x > 0.1$, and a Gaussian function in the middle of the domain:
$u(x,0) = \exp(-\frac{1}{2}(x - 0.5)^2/0.05)$. The domain is the unit interval
$[0, 1]$. Run these two shapes for Courant numbers 1 and 0.5. Assume constant
wave velocity. Make movies of the four cases. Reason why the solutions are correct.

Filename: periodic. 


Exercise 2.14: Compare discretizations of a Neumann condition 

We have a 1D wave equation with variable wave velocity: $u_{tt} = (qu_x)_x$.
A Neumann condition $u_x = 0$ at $x = 0, L$ can be discretized as shown in
(2.54) and (2.57). The aim of this exercise is to examine the convergence rate
of the numerical error when using different ways of discretizing the Neumann
condition.


a) As a test problem, $q = 1 + (x - L/2)^4$ can be used, with $f(x,t)$ adapted
such that the solution has a simple form, say
$u(x,t) = \cos(\pi x/L)\cos(\omega t)$ for, e.g., $\omega = 1$. Perform numerical
experiments and find the convergence rate of the error using the approximation
(2.54).

b) Switch to $q(x) = 1 + \cos(\pi x/L)$, which is symmetric at $x = 0, L$, and
check the convergence rate of the scheme (2.57). Now, $q_{i-1/2}$ is a 2nd-order
approximation to $q_i$, $q_{i-1/2} = q_i + \frac{1}{8}q_i''\Delta x^2 + \cdots$,
because $q_i' = 0$ for $i = N_x$ (a similar argument can be applied to the case
$i = 0$).

c) A third discretization can be based on a simple and convenient, but less
accurate, one-sided difference: $u_i - u_{i-1} = 0$ at $i = N_x$ and
$u_{i+1} - u_i = 0$ at $i = 0$. Derive the resulting scheme in detail and
implement it. Run experiments with $q$ from a) or b) to establish the rate of
convergence of the scheme.

d) A fourth technique is to view the scheme as

$$[D_tD_tu]^n_i = \frac{1}{\Delta x}\left([qD_xu]^n_{i+\frac{1}{2}} - [qD_xu]^n_{i-\frac{1}{2}}\right) + [f]^n_i,$$

and place the boundary at $x_{i+\frac{1}{2}}$, $i = N_x$, instead of exactly at
the physical boundary. With this idea of approximating (moving) the boundary, we
can just set $[qD_xu]^n_{i+\frac{1}{2}} = 0$. Derive the complete scheme using
this technique. The implementation of the boundary condition at $L - \Delta x/2$
is $O(\Delta x^2)$ accurate, but the interesting question is what impact the
movement of the boundary has on the convergence rate. Compute the errors as usual
over the entire mesh and use $q$ from a) or b).


Filename: Neumann_discr. 


Exercise 2.15: Verification by a cubic polynomial in space 

The purpose of this exercise is to verify the implementation of the solver
function in the program wave1D_n0.py by using an exact numerical solution for the
wave equation $u_{tt} = c^2u_{xx} + f$ with Neumann boundary conditions
$u_x(0,t) = u_x(L,t) = 0$.

A similar verification is used in the file wave1D_u0.py, which solves the same
PDE, but with Dirichlet boundary conditions $u(0,t) = u(L,t) = 0$. The idea
of the verification test in function test_quadratic in wave1D_u0.py is to produce
a solution that is a lower-order polynomial such that the PDE problem, the
boundary conditions, and all the discrete equations are exactly fulfilled. Then
the solver function should reproduce this exact solution to machine precision.
More precisely, we seek $u = X(x)T(t)$, with $T(t)$ as a linear function and
$X(x)$ as a parabola that fulfills the boundary conditions. Inserting this $u$ in
the PDE determines $f$. It turns out that $u$ also fulfills the discrete
equations, because the truncation error of the discretized PDE has derivatives in
$x$ and $t$ of order four and higher. These derivatives all vanish for a
quadratic $X(x)$ and linear $T(t)$.

It would be attractive to use a similar approach in the case of Neumann
conditions. We set $u = X(x)T(t)$ and seek lower-order polynomials $X$ and $T$.
To force $u_x$ to vanish at the boundary, we let $X_x$ be a parabola. Then $X$ is
a cubic polynomial. The fourth-order derivative of a cubic polynomial vanishes,
so $u = X(x)T(t)$ will fulfill the discretized PDE also in this case, if $f$ is
adjusted such that $u$ fulfills the PDE.

However, the discrete boundary condition is not exactly fulfilled by this choice
of $u$. The reason is that

$$[D_{2x}u]^n_i = u_x(x_i,t_n) + \frac{1}{6}u_{xxx}(x_i,t_n)\Delta x^2 + O(\Delta x^4) . \tag{2.74}$$
At the two boundary points, we must demand that the derivative $X_x(x) = 0$ such
that $u_x = 0$. However, $u_{xxx}$ is a constant and not zero when $X(x)$ is a
cubic polynomial. Therefore, our $u = X(x)T(t)$ fulfills

$$[D_{2x}u]^n_i = \frac{1}{6}u_{xxx}(x_i,t_n)\Delta x^2,$$

and not

$$[D_{2x}u]^n_i = 0,\quad i = 0, N_x,$$

as it should. (Note that all the higher-order terms $O(\Delta x^4)$ also have
higher-order derivatives that vanish for a cubic polynomial.) So to summarize,
the fundamental problem is that $u$ as a product of a cubic polynomial and a
linear or quadratic polynomial in time is not an exact solution of the discrete
boundary conditions.

To make progress, we assume that $u = X(x)T(t)$, where $T$ for simplicity is
taken as a prescribed linear function $1 + \frac{1}{2}t$, and $X(x)$ is taken as
an unknown cubic polynomial $\sum_{j=0}^{3}a_jx^j$. There are two different ways
of determining the coefficients $a_0,\ldots,a_3$ such that both the discretized
PDE and the discretized boundary conditions are fulfilled, under the constraint
that we can specify a function $f(x,t)$ for the PDE to feed to the solver
function in wave1D_n0.py. Both approaches are explained in the subexercises.


a) One can insert $u$ in the discretized PDE and find the corresponding $f$. Then
one can insert $u$ in the discretized boundary conditions. This yields two
equations for the four coefficients $a_0,\ldots,a_3$. To find the coefficients,
one can set $a_0 = 0$ and $a_1 = 1$ for simplicity and then determine $a_2$ and
$a_3$. This approach will make $a_2$ and $a_3$ depend on $\Delta x$ and $f$ will
depend on both $\Delta x$ and $\Delta t$.

Use sympy to perform analytical computations. A starting point is to define u
as follows:


def test_cubic1():
    import sympy as sm
    x, t, c, L, dx, dt = sm.symbols('x t c L dx dt')
    i, n = sm.symbols('i n', integer=True)

    # Assume discrete solution is a polynomial of degree 3 in x
    T = lambda t: 1 + sm.Rational(1,2)*t  # Temporal term
    a = sm.symbols('a_0 a_1 a_2 a_3')
    X = lambda x: sum(a[q]*x**q for q in range(4))  # Spatial term
    u = lambda x, t: X(x)*T(t)


The symbolic expression for u is reached by calling u(x,t) with x and t as 
sympy symbols. 

Define DxDx(u, i, n), DtDt(u, i, n), and D2x(u, i, n) as Python functions for
returning the difference approximations $[D_xD_xu]^n_i$, $[D_tD_tu]^n_i$, and
$[D_{2x}u]^n_i$. The next step is to set up the residuals for the equations
$[D_{2x}u]^n_0 = 0$ and $[D_{2x}u]^n_{N_x} = 0$, where $N_x = L/\Delta x$. Call
the residuals R_0 and R_L. Substitute $a_0$ and $a_1$ by 0 and 1, respectively,
in R_0, R_L, and a:

    R_0 = R_0.subs(a[0], 0).subs(a[1], 1)
    R_L = R_L.subs(a[0], 0).subs(a[1], 1)
    a = list(a)  # enable in-place assignment
    a[0:2] = 0, 1


Determining $a_2$ and $a_3$ from the discretized boundary conditions is then about
solving two equations with respect to $a_2$ and $a_3$, i.e., a[2:]:


s = sm.solve([R_0, R_L], a[2:]) 
# s is dictionary with the unknowns a[2] and a[3] as keys 


a[2:] = s[a[2]], s[a[3]] 


Now, a contains computed values and u will automatically use these new values 
since X accesses a. 

Compute the source term $f$ from the discretized PDE:
$f^n_i = [D_tD_tu - c^2D_xD_xu]^n_i$. Turn $u$, the time derivative $u_t$ (needed
for the initial condition $V(x)$), and $f$ into Python functions. Set numerical
values for $L$, $N_x$, $C$, and $c$. Prescribe the time interval as
$\Delta t = CL/(N_xc)$, which implies $\Delta x = c\Delta t/C = L/N_x$. Define
new functions I(x), V(x), and f(x,t) as wrappers of the ones made above, where
fixed values of $L$, $c$, $\Delta x$, and $\Delta t$ are inserted, such that I,
V, and f can be passed on to the solver function. Finally, call solver with a
user_action function that compares the numerical solution to this exact solution
$u$ of the discrete PDE problem.


Hint To turn a sympy expression e, depending on a series of symbols, say x, t,
dx, dt, L, and c, into a plain Python function e_exact(x,t,L,dx,dt,c), one can
write


    e_exact = sm.lambdify([x,t,L,dx,dt,c], e, 'numpy')


The 'numpy' argument is a good habit as the e_exact function will then work
with array arguments if it contains mathematical functions (but here we only do
plain arithmetic, which automatically works with arrays).


b) An alternative way of determining $a_0,\ldots,a_3$ is to reason as follows. We
first construct $X(x)$ such that the boundary conditions are fulfilled:
$X_x = x(L - x)$. However, to compensate for the fact that this choice of $X$
does not fulfill the discrete boundary condition, we seek $u$ such that

$$u_x = x(L - x)T(t) - \frac{1}{6}u_{xxx}\Delta x^2,$$

since this $u$ will fit the discrete boundary condition. Assuming
$u = T(t)\sum_{j=0}^{3}a_jx^j$, we can use the above equation to determine the
coefficients $a_1, a_2, a_3$. A value, e.g., 1 can be used for $a_0$. The
following sympy code computes this $u$:

def test_cubic2():
    import sympy as sm
    x, t, c, L, dx = sm.symbols('x t c L dx')
    T = lambda t: 1 + sm.Rational(1,2)*t  # Temporal term
    # Set u as a 3rd-degree polynomial in space
    a = sm.symbols('a_0 a_1 a_2 a_3')
    X = lambda x: sum(a[i]*x**i for i in range(4))
    u = lambda x, t: X(x)*T(t)
    # Force discrete boundary condition to be zero by adding
    # a correction term to the analytical suggestion x*(L-x)*T
    # u_x = x*(L-x)*T(t) - 1/6*u_xxx*dx**2
    R = sm.diff(u(x,t), x) - (
        x*(L-x)*T(t) - sm.Rational(1,6)*sm.diff(u(x,t), x, x, x)*dx**2)
    # R is a polynomial: force all coefficients to vanish.
    # Turn R to Poly to extract coefficients:
    R = sm.poly(R, x)
    coeff = R.all_coeffs()
    s = sm.solve(coeff, a[1:])  # a[0] is not present in R
    # s is dictionary with a[i] as keys
    # Fix a[0] as 1
    s[a[0]] = 1
    X = lambda x: sm.simplify(sum(s[a[i]]*x**i for i in range(4)))
    u = lambda x, t: X(x)*T(t)
    print('u: %s' % u(x,t))


The next step is to find the source term f_e by inserting u_e in the PDE. There- 
after, turn u, f, and the time derivative of u into plain Python functions as in a), 
and then wrap these functions in new functions I, V, and f, with the right signa- 
ture as required by the solver function. Set parameters as in a) and check that 
the solution is exact to machine precision at each time level using an appropriate 
user_action function. 


Filename: wave1D_n0_test_cubic.
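
As a starting point for approach a), the following sketch solves the two discrete boundary conditions for $a_2$ and $a_3$ and converts the resulting $u$ to a numerical function. It is only an illustration under stated assumptions: the difference operator is written in terms of the mesh point coordinates rather than the indices i and n, the name D2x with this signature is hypothetical, and the source term $f$ is not computed.

```python
import numpy as np
import sympy as sm

x, t, L, dx = sm.symbols('x t L dx', positive=True)
a = list(sm.symbols('a_0 a_1 a_2 a_3'))
a2, a3 = a[2], a[3]                      # keep the unknowns as symbols
T = lambda t: 1 + sm.Rational(1, 2)*t    # temporal term
X = lambda x: sum(a[q]*x**q for q in range(4))
u = lambda x, t: X(x)*T(t)

def D2x(u, x_i, t_n):
    """Centered difference [D_2x u] at the point (x_i, t_n)."""
    return (u(x_i + dx, t_n) - u(x_i - dx, t_n))/(2*dx)

a[0:2] = 0, 1                            # choose a_0 = 0 and a_1 = 1
R_0 = D2x(u, 0, t)                       # residual at x = 0
R_L = D2x(u, L, t)                       # residual at x = L
s = sm.solve([R_0, R_L], [a2, a3])
a[2:] = s[a2], s[a3]                     # X now uses the computed values
print('a_2 = %s, a_3 = %s' % (a[2], a[3]))

# Turn the symbolic u into a function that accepts numpy arrays
u_exact = sm.lambdify([x, t, L, dx], u(x, t), 'numpy')
values = u_exact(np.array([0.0, 0.5, 1.0]), 0.0, 1.0, 0.1)
```

Both coefficients scale as $1/\Delta x^2$, which illustrates the remark in a) that the coefficients depend on the mesh spacing.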


2.10 Analysis of the Difference Equations 
2.10.1 Properties of the Solution of the Wave Equation 


The wave equation

$$\frac{\partial^2u}{\partial t^2} = c^2\frac{\partial^2u}{\partial x^2}$$

has solutions of the form

$$u(x,t) = g_R(x - ct) + g_L(x + ct), \tag{2.75}$$

for any functions $g_R$ and $g_L$ sufficiently smooth to be differentiated twice.
The result follows from inserting (2.75) in the wave equation. A function of the
form $g_R(x - ct)$ represents a signal moving to the right in time with constant
velocity $c$. This feature can be explained as follows. At time $t = 0$ the
signal looks like $g_R(x)$. Introducing a moving horizontal coordinate
$\xi = x - ct$, we see the function $g_R(\xi)$
is “at rest” in the $\xi$ coordinate system, and the shape is always the same.
Say the $g_R(\xi)$ function has a peak at $\xi = 0$. This peak is located at
$x = ct$, which means that it moves with the velocity $dx/dt = c$ in the $x$
coordinate system. Similarly, $g_L(x + ct)$ is a function, initially with shape
$g_L(x)$, that moves in the negative $x$ direction with constant velocity $c$
(introduce $\xi = x + ct$, look at the point $\xi = 0$, $x = -ct$, which has
velocity $dx/dt = -c$).

With the particular initial conditions

$$u(x,0) = I(x),\quad \frac{\partial}{\partial t}u(x,0) = 0,$$

we get, with $u$ as in (2.75),

$$g_R(x) + g_L(x) = I(x),\quad -cg_R'(x) + cg_L'(x) = 0 .$$

The latter suggests $g_R = g_L$, and the former then leads to
$g_R = g_L = I/2$. Consequently,

$$u(x,t) = \frac{1}{2}I(x - ct) + \frac{1}{2}I(x + ct) . \tag{2.76}$$

The interpretation of (2.76) is that the initial shape of $u$ is split into two
parts, each with the same shape as $I$ but half of the initial amplitude. One
part is traveling to the left and the other one to the right.
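
The claim that (2.76) solves the wave equation with the given initial conditions can be checked symbolically. The sketch below does so for one sample initial shape; the Gaussian $I(x) = e^{-x^2}$ is an assumption chosen only for the illustration, and any twice differentiable shape could be used instead.

```python
import sympy as sm

x, t = sm.symbols('x t', real=True)
c = sm.symbols('c', positive=True)
I = lambda x: sm.exp(-x**2)              # sample initial shape (assumption)
u = (I(x - c*t) + I(x + c*t))/2          # the solution (2.76)

# Residual of the PDE and of the two initial conditions
residual = sm.simplify(sm.expand(sm.diff(u, t, 2) - c**2*sm.diff(u, x, 2)))
u_at_0 = sm.simplify(u.subs(t, 0) - I(x))
ut_at_0 = sm.simplify(sm.diff(u, t).subs(t, 0))
print('%s %s %s' % (residual, u_at_0, ut_at_0))
```

All three residuals should vanish identically, independent of the value of $c$.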

The solution has two important physical features: constant amplitude of the left
and right wave, and constant velocity of these two waves. It turns out that the
numerical solution will also preserve the constant amplitude, but the velocity
depends on the mesh parameters $\Delta t$ and $\Delta x$.

The solution (2.76) will be influenced by boundary conditions when the parts
$\frac{1}{2}I(x - ct)$ and $\frac{1}{2}I(x + ct)$ hit the boundaries and get,
e.g., reflected back into the domain. However, when $I(x)$ is nonzero only in a
small part in the middle of the spatial domain $[0, L]$, which means that the
boundaries are placed far away from the initial disturbance of $u$, the solution
(2.76) is very clearly observed in a simulation.

A useful representation of solutions of wave equations is a linear combination
of sine and/or cosine waves. Such a sum of waves is a solution if the governing
PDE is linear and each sine or cosine wave fulfills the equation. To ease
analytical calculations by hand we shall work with complex exponential functions
instead of real-valued sine or cosine functions. The real part of complex
expressions will typically be taken as the physically relevant quantity (whenever
a physically relevant quantity is strictly needed). The idea now is to build
$I(x)$ of complex wave components $e^{ikx}$:

$$I(x) \approx \sum_{k\in K}b_ke^{ikx} . \tag{2.77}$$

Here, $k$ is the frequency of a component, $K$ is some set of all the discrete
$k$ values needed to approximate $I(x)$ well, and $b_k$ are constants that must
be determined. We will very seldom need to compute the $b_k$ coefficients: most
of the insight we look for, and the understanding of the numerical methods we
want to establish, come from investigating how the PDE and the scheme treat a
single component $e^{ikx}$ wave.
Letting the number of $k$ values in $K$ tend to infinity makes the sum (2.77)
converge to $I(x)$. This sum is known as a Fourier series representation of
$I(x)$. Looking at (2.76), we see that the solution $u(x,t)$, when $I(x)$ is
represented as in (2.77), is also built of basic complex exponential wave
components of the form $e^{ik(x\pm ct)}$ according to

$$u(x,t) = \frac{1}{2}\sum_{k\in K}b_ke^{ik(x - ct)} + \frac{1}{2}\sum_{k\in K}b_ke^{ik(x + ct)} . \tag{2.78}$$

It is common to introduce the frequency in time $\omega = kc$ and assume that
$u(x,t)$ is a sum of basic wave components written as $e^{i(kx - \omega t)}$.
(Observe that inserting such a wave component in the governing PDE reveals that
$\omega^2 = k^2c^2$, or $\omega = \pm kc$, reflecting the two solutions: one
($+kc$) traveling to the right and the other ($-kc$) traveling to the left.)


2.10.2 More Precise Definition of Fourier Representations 


The above introduction to function representation by sine and cosine waves was 
quick and intuitive, but will suffice as background knowledge for the following 
material of single wave component analysis. However, to understand all details of 
how different wave components sum up to the analytical and numerical solutions, a 
more precise mathematical treatment is helpful and therefore summarized below. 

It is well known that periodic functions can be represented by Fourier series. A
generalization of the Fourier series idea to non-periodic functions defined on
the real line is the Fourier transform:

$$I(x) = \int_{-\infty}^{\infty}A(k)e^{ikx}\,dk, \tag{2.79}$$

$$A(k) = \int_{-\infty}^{\infty}I(x)e^{-ikx}\,dx . \tag{2.80}$$

The function $A(k)$ reflects the weight of each wave component $e^{ikx}$ in an
infinite sum of such wave components. That is, $A(k)$ reflects the frequency
content in the function $I(x)$. Fourier transforms are particularly fundamental
for analyzing and understanding time-varying signals.
The solution of the linear 1D wave PDE can be expressed as

$$u(x,t) = \int_{-\infty}^{\infty}A(k)e^{i(kx - \omega(k)t)}\,dk .$$

In a finite difference method, we represent $u$ by a mesh function $u^n_q$, where
$n$ counts temporal mesh points and $q$ counts the spatial ones (the usual counter
for spatial points, $i$, is here already used as imaginary unit). Similarly,
$I(x)$ is approximated by the mesh function $I_q$, $q = 0,\ldots,N_x$. On a mesh,
it does not make sense
to work with wave components $e^{ikx}$ for very large $k$, because the shortest
possible sine or cosine wave that can be represented uniquely on a mesh with
spacing $\Delta x$ is the wave with wavelength $2\Delta x$. This wave has its
peaks and troughs at every two mesh points. That is, the wave “jumps up and down”
between the mesh points.

The corresponding $k$ value for the shortest possible wave in the mesh is
$k = 2\pi/(2\Delta x) = \pi/\Delta x$. This maximum frequency is known as the
Nyquist frequency. Within the range of relevant frequencies $(0, \pi/\Delta x]$
one defines the discrete Fourier transform$^1$, using $N_x + 1$ discrete
frequencies:

$$I_q = \frac{1}{N_x + 1}\sum_{k=0}^{N_x}A_ke^{i2\pi kq/(N_x+1)},\quad q = 0,\ldots,N_x, \tag{2.81}$$

$$A_k = \sum_{q=0}^{N_x}I_qe^{-i2\pi kq/(N_x+1)},\quad k = 0,\ldots,N_x . \tag{2.82}$$

The $A_k$ values represent the discrete Fourier transform of the $I_q$ values,
which themselves are the inverse discrete Fourier transform of the $A_k$ values.

The discrete Fourier transform is efficiently computed by the Fast Fourier trans- 
form algorithm. For a real function /(x), the relevant Python code for computing 
and plotting the discrete Fourier transform appears in the example below. 


import numpy as np 
from numpy import sin, pi 


def I(x): 
return sin(2*pi*x) + 0.5*sin(4*pi*x) + 0.1*sin(6*pi*x) 


# Mesh 

L = 10; Nx = 100 

x = np.linspace(0, L, Nx+1) 
dx = L/float(Nx)


# Discrete Fourier transform 
A = np.fft.rfft(I(x)) 
A_amplitude = np.abs(A) 


# Compute the corresponding frequencies 
freqs = np.linspace(0, pi/dx, A_amplitude.size) 


import matplotlib.pyplot as plt 
plt.plot(freqs, A_amplitude) 
plt.show() 
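
To connect the formulas (2.81) and (2.82) to the library routines, the sums can be evaluated directly and compared with numpy's FFT functions. The helper names dft and inverse_dft below are hypothetical and the direct sums are meant only as a small-scale check, since they require $O(N_x^2)$ operations.

```python
import numpy as np

def dft(I_q):
    """A_k = sum_q I_q exp(-i 2 pi k q/(Nx + 1)), cf. (2.82)."""
    N = len(I_q)                       # N = Nx + 1 mesh points
    q = np.arange(N)
    k = q.reshape(N, 1)
    return (I_q*np.exp(-2j*np.pi*k*q/N)).sum(axis=1)

def inverse_dft(A_k):
    """I_q = 1/(Nx + 1) sum_k A_k exp(i 2 pi k q/(Nx + 1)), cf. (2.81)."""
    N = len(A_k)
    k = np.arange(N)
    q = k.reshape(N, 1)
    return (A_k*np.exp(2j*np.pi*k*q/N)).sum(axis=1)/N

x = np.linspace(0, 10, 101)
I_q = np.sin(2*np.pi*x) + 0.5*np.sin(4*np.pi*x)
A_k = dft(I_q)
print(np.allclose(A_k, np.fft.fft(I_q)))
```

For a real function, np.fft.rfft returns only the first half of these coefficients, since the remaining ones are complex conjugates of the first half.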


1 http://en.wikipedia.org/wiki/Discrete_Fourier_transform


2.10.3 Stability


The scheme

$$[D_tD_tu = c^2D_xD_xu]^n_q \tag{2.83}$$

for the wave equation $u_{tt} = c^2u_{xx}$ allows basic wave components

$$u^n_q = e^{i(kx_q - \tilde\omega t_n)}$$

as solution, but it turns out that the frequency in time, $\tilde\omega$, is not
equal to the exact frequency $\omega = kc$. The goal now is to find exactly what
$\tilde\omega$ is. We ask two key questions:


• How accurate is $\tilde\omega$ compared to $\omega$?
• Does the amplitude of such a wave component preserve its (unit) amplitude, as
  it should, or does it get amplified or damped in time (because of a complex
  $\tilde\omega$)?


The following analysis will answer these questions. We shall continue using $q$
as an identifier for a certain mesh point in the $x$ direction.


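Preliminary results A key result needed in the investigations is the finite
difference approximation of a second-order derivative acting on a complex wave
component:

$$[D_tD_te^{i\omega t}]^n = -\frac{4}{\Delta t^2}\sin^2\left(\frac{\omega\Delta t}{2}\right)e^{i\omega n\Delta t} .$$

By just changing symbols ($\omega\rightarrow k$, $t\rightarrow x$,
$n\rightarrow q$) it follows that

$$[D_xD_xe^{ikx}]_q = -\frac{4}{\Delta x^2}\sin^2\left(\frac{k\Delta x}{2}\right)e^{ikq\Delta x} .$$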
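
The first of these identities is easy to confirm numerically. The sketch below evaluates both sides for arbitrary sample values of $\omega$, $\Delta t$, and $n$; the helper DtDt with this signature is hypothetical and simply applies the centered difference to a Python function.

```python
import numpy as np

def DtDt(f, t, dt):
    """Centered second-order difference [D_t D_t f] at time t."""
    return (f(t + dt) - 2*f(t) + f(t - dt))/dt**2

omega, dt, n = 3.0, 0.1, 7                    # arbitrary sample values
f = lambda t: np.exp(1j*omega*t)
lhs = DtDt(f, n*dt, dt)
rhs = -4/dt**2*np.sin(omega*dt/2)**2*np.exp(1j*omega*n*dt)
print(abs(lhs - rhs))
```

The difference between the two sides is at the level of round-off, confirming that the relation is exact and not just an approximation for small $\Delta t$.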
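Numerical wave propagation Inserting a basic wave component
$u^n_q = e^{i(kx_q - \tilde\omega t_n)}$ in (2.83) results in the need to
evaluate two expressions:

$$[D_tD_te^{ikx}e^{-i\tilde\omega t}]^n_q = [D_tD_te^{-i\tilde\omega t}]^ne^{ikq\Delta x} = -\frac{4}{\Delta t^2}\sin^2\left(\frac{\tilde\omega\Delta t}{2}\right)e^{-i\tilde\omega n\Delta t}e^{ikq\Delta x}, \tag{2.84}$$

$$[D_xD_xe^{ikx}e^{-i\tilde\omega t}]^n_q = [D_xD_xe^{ikx}]_qe^{-i\tilde\omega n\Delta t} = -\frac{4}{\Delta x^2}\sin^2\left(\frac{k\Delta x}{2}\right)e^{ikq\Delta x}e^{-i\tilde\omega n\Delta t} . \tag{2.85}$$

Then the complete scheme,

$$[D_tD_te^{ikx}e^{-i\tilde\omega t} = c^2D_xD_xe^{ikx}e^{-i\tilde\omega t}]^n_q,$$

leads to the following equation for the unknown numerical frequency
$\tilde\omega$ (after dividing by $-e^{ikx}e^{-i\tilde\omega t}$):

$$\frac{4}{\Delta t^2}\sin^2\left(\frac{\tilde\omega\Delta t}{2}\right) = c^2\frac{4}{\Delta x^2}\sin^2\left(\frac{k\Delta x}{2}\right),$$

or

$$\sin^2\left(\frac{\tilde\omega\Delta t}{2}\right) = C^2\sin^2\left(\frac{k\Delta x}{2}\right), \tag{2.86}$$

where

$$C = \frac{c\Delta t}{\Delta x} \tag{2.87}$$

is the Courant number. Taking the square root of (2.86) yields

$$\sin\left(\frac{\tilde\omega\Delta t}{2}\right) = C\sin\left(\frac{k\Delta x}{2}\right) . \tag{2.88}$$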
Since the exact $\omega$ is real it is reasonable to look for a real solution
$\tilde\omega$ of (2.88). The right-hand side of (2.88) must then be in
$[-1, 1]$ because the sine function on the left-hand side has values in
$[-1, 1]$ for real $\tilde\omega$. The magnitude of the sine function on the
right-hand side attains the value 1 when

$$\frac{k\Delta x}{2} = \frac{\pi}{2} + m\pi,\quad m\in\mathbb{Z} .$$

With $m = 0$ we have $k\Delta x = \pi$, which means that the wavelength
$\lambda = 2\pi/k$ becomes $2\Delta x$. This is the absolutely shortest
wavelength that can be represented on the mesh: the wave jumps up and down
between each mesh point. Larger values of $|m|$ are irrelevant since these
correspond to $k$ values whose waves are too short to be represented on a mesh
with spacing $\Delta x$. For the shortest possible wave in the mesh,
$\sin(k\Delta x/2) = 1$, and we must require

$$C \le 1 . \tag{2.89}$$

Consider a right-hand side in (2.88) of magnitude larger than unity. The solution
$\tilde\omega$ of (2.88) must then be a complex number
$\tilde\omega = \tilde\omega_r + i\tilde\omega_i$ because the sine function can
exceed unity in magnitude only for a complex argument. One can show that for any
$\tilde\omega_i$ there will also be a corresponding solution with
$-\tilde\omega_i$. The component with $\tilde\omega_i > 0$ gives an amplification
factor $e^{\tilde\omega_it}$ that grows exponentially in time. We cannot allow
this and must therefore require $C \le 1$ as a stability criterion.


Remark on the stability requirement 

For smoother wave components with longer wave lengths per length $\Delta x$,
(2.89) can in theory be relaxed. However, small round-off errors are always
present in a numerical solution and these vary arbitrarily from mesh point to
mesh point and can be viewed as unavoidable noise with wavelength $2\Delta x$.
As explained, $C > 1$ will for this very small noise lead to exponential growth
of the shortest possible wave component in the mesh. This noise will therefore
grow with time and destroy the whole solution.
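
The effect described in this remark can be demonstrated with a few lines of code. The sketch below is not the solver function from the book's programs but a stripped-down version of the scheme (2.83) with $u = 0$ at both ends and $u_t(x,0) = 0$, started from a smooth sine shape plus a tiny sawtooth perturbation of size $10^{-12}$ that mimics round-off noise. The function name max_amplitude and all parameter values are assumptions made for the illustration.

```python
import numpy as np

def max_amplitude(C, Nx=50, num_steps=200, noise=1e-12):
    """Run the scheme on [0,1] and return the largest |u| encountered."""
    x = np.linspace(0, 1, Nx + 1)
    # Smooth initial shape plus a tiny sawtooth with wavelength 2*dx
    u_n = np.sin(np.pi*x) + noise*(-1.0)**np.arange(Nx + 1)
    u_n[0] = u_n[-1] = 0.0
    # Special formula for the first step when u_t(x,0) = 0
    u = u_n.copy()
    u[1:-1] = u_n[1:-1] + 0.5*C**2*(u_n[2:] - 2*u_n[1:-1] + u_n[:-2])
    u_nm1, u_n = u_n, u
    max_abs = np.abs(u_n).max()
    for n in range(1, num_steps):
        u = np.zeros_like(u_n)         # boundary values stay zero
        u[1:-1] = (2*u_n[1:-1] - u_nm1[1:-1]
                   + C**2*(u_n[2:] - 2*u_n[1:-1] + u_n[:-2]))
        u_nm1, u_n = u_n, u
        max_abs = max(max_abs, np.abs(u_n).max())
    return max_abs

for C in 1.0, 1.05:
    print('C=%.2f: max |u| = %.3e' % (C, max_amplitude(C)))
```

With $C = 1$ the amplitude stays at the level of the initial condition, while $C = 1.05$ makes the initially invisible sawtooth component dominate the solution completely after a couple of hundred time steps.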


2.10.4 Numerical Dispersion Relation 


Equation (2.88) can be solved with respect to $\tilde\omega$:

$$\tilde\omega = \frac{2}{\Delta t}\sin^{-1}\left(C\sin\left(\frac{k\Delta x}{2}\right)\right) . \tag{2.90}$$
The relation between the numerical frequency $\tilde\omega$ and the other
parameters $k$, $c$, $\Delta x$, and $\Delta t$ is called a numerical dispersion
relation. Correspondingly, $\omega = kc$ is the analytical dispersion relation.
In general, dispersion refers to the phenomenon where the wave velocity depends
on the spatial frequency ($k$, or the wave length $\lambda = 2\pi/k$) of the
wave. Since the wave velocity is $\omega/k = c$, we realize that the analytical
dispersion relation reflects the fact that there is no dispersion. However, in a
numerical scheme we have dispersive waves where the wave velocity depends on $k$.
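
A small sketch of how (2.90) can be evaluated numerically is given below. Using a complex argument in np.arcsin allows the same function to cover the unstable case where $C\sin(k\Delta x/2) > 1$ and the numerical frequency gets an imaginary part. The function name numerical_frequency and the sample parameter values are assumptions for the illustration.

```python
import numpy as np

def numerical_frequency(k, c, dx, dt):
    """Numerical frequency from (2.90); complex if C*sin(k*dx/2) > 1."""
    C = c*dt/dx
    arg = C*np.sin(k*dx/2) + 0j      # complex type lets arcsin leave [-1, 1]
    return 2/dt*np.arcsin(arg)

c = 1.0; dx = 0.1
k = np.pi/dx                         # shortest wave, wavelength 2*dx
w_exact  = numerical_frequency(0.5*k, c, dx, dt=dx/c)       # C = 1
w_stable = numerical_frequency(0.5*k, c, dx, dt=0.8*dx/c)   # C = 0.8
w_unstab = numerical_frequency(k, c, dx, dt=1.1*dx/c)       # C = 1.1
print('exact frequency: %g' % (0.5*k*c))
print('C=1.0: %s\nC=0.8: %s\nC=1.1: %s' % (w_exact, w_stable, w_unstab))
```

The case $C = 0.8$ gives a real frequency that is smaller than $kc$, i.e., the wave component moves too slowly, while $C = 1.1$ for the shortest wave gives a frequency with nonzero imaginary part.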

The special case $C = 1$ deserves attention since then the right-hand side of
(2.90) reduces to

$$\frac{2}{\Delta t}\frac{k\Delta x}{2} = \frac{1}{\Delta t}\frac{\omega\Delta x}{c} = \frac{\omega}{C} = \omega .$$

That is, $\tilde\omega = \omega$ and the numerical solution is exact at all mesh
points regardless of $\Delta x$ and $\Delta t$! This implies that the numerical
solution method is also an analytical solution method, at least for computing
$u$ at discrete points (the numerical method says nothing about the variation of
$u$ between the mesh points, and employing the common linear interpolation for
extending the discrete solution gives a curve that in general deviates from the
exact one).

For a closer examination of the error in the numerical dispersion relation when
$C < 1$, we can study $\tilde\omega - \omega$, $\tilde\omega/\omega$, or the
similar error measures in wave velocity: $\tilde c - c$ and $\tilde c/c$, where
$c = \omega/k$ and $\tilde c = \tilde\omega/k$. It appears that the most
convenient expression to work with is $\tilde c/c$, since it can be written as a
function of just two parameters:

$$\frac{\tilde c}{c} = \frac{1}{Cp}\sin^{-1}(C\sin p),$$

with $p = k\Delta x/2$ as a non-dimensional measure of the spatial frequency. In
essence, $p$ tells how many spatial mesh points we have per wave length in space
for the wave component with frequency $k$ (recall that the wave length is
$2\pi/k$, so there are $\pi/p$ mesh intervals per wave length). That is, $p$
reflects how well the spatial variation of the wave component is resolved in the
mesh. Wave components with wave length less than $2\Delta x$
($2\pi/k < 2\Delta x$) are not visible in the mesh, so it does not make sense to
have $p > \pi/2$.

We may introduce the function r(C, p) = ¢/c for further investigation of nu- 
merical errors in the wave velocity: 


r(C,p)= gi (Csinp), C e(0,1], p €e 0,7/2]. (2.91) 


This function is very well suited for plotting since it combines several parameters 
in the problem into a dependence on two dimensionless numbers, C and p. 
Defining 


from sympy import asin, sin   # symbolic versions, needed for series expansions

def r(C, p):
    return 1/(C*p)*asin(C*sin(p))


we can plot r(C, p) as a function of p for various values of C, see Fig. 2.6. Note
that the shortest waves have the most erroneous velocity, and that short waves move 
more slowly than they should. 
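The claim about the shortest waves can be checked without a plot. The sketch below is an assumption-laden illustration (a NumPy variant of r with the coefficient $1/(Cp)$ from (2.91), and arbitrarily chosen Courant numbers); it tabulates the smallest velocity ratio and where it occurs.

```python
import numpy as np

def r(C, p):
    """Ratio of numerical to exact wave velocity, cf. (2.91)."""
    return np.arcsin(C*np.sin(p))/(C*p)

p = np.linspace(0.001, np.pi/2, 101)     # avoid p=0, which gives 0/0
for C in 1.0, 0.95, 0.8, 0.3:
    ratio = r(C, p)
    print('C=%.2f: min r = %.4f at p = %.3f' %
          (C, ratio.min(), p[ratio.argmin()]))
```

For $C < 1$ the minimum is found at the largest $p$, i.e., for the shortest wave that the mesh can represent, while $C = 1$ gives a ratio of 1 for all $p$.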
Fig. 2.6 The fractional error in the wave velocity for different Courant numbers


We can also easily make a Taylor series expansion in the discretization parameter 
p: 


>>> import sympy as sym
>>> C, p = sym.symbols('C p')
>>> # Compute the 7 first terms around p=0 with no O() term
>>> rs = r(C, p).series(p, 0, 7).removeO()
>>> rs
p**6*(5*C**6/112 - C**4/16 + 13*C**2/720 - 1/5040) +
p**4*(3*C**4/40 - C**2/12 + 1/120) +
p**2*(C**2/6 - 1/6) + 1

>>> # Pick out the leading order term, but drop the constant 1
>>> rs_error_leading_order = (rs - 1).extract_leading_order(p)
>>> rs_error_leading_order
p**2*(C**2/6 - 1/6)

>>> # Turn the series expansion into a Python function
>>> rs_pyfunc = sym.lambdify([C, p], rs, modules='numpy')

>>> # Check: rs_pyfunc is exact (=1) for C=1
>>> rs_pyfunc(1, 0.1)
1.0


Note that without the .removeO() call the series gets an O(p**7) term that makes
it impossible to convert the series to a Python function (for, e.g., plotting).

From the rs_error_leading_order expression above, we see that the leading
order term in the error of this series expansion is

$$\frac{1}{6}\left(\frac{k\Delta x}{2}\right)^2(C^2 - 1) = \frac{k^2}{24}\left(c^2\Delta t^2 - \Delta x^2\right),$$

pointing to an error $O(\Delta t^2, \Delta x^2)$, which is compatible with the errors in the
difference approximations ($D_tD_tu$ and $D_xD_xu$).
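The leading-order term can also be checked numerically. The following is a minimal sketch (the value $C = 0.8$ and the $p$ values are arbitrary choices for illustration): the ratio between $r(C, p) - 1$ and $p^2(C^2 - 1)/6$ should approach 1 as $p$ decreases.

```python
from math import asin, sin

def r(C, p):
    """Velocity ratio from (2.91), evaluated with floats."""
    return asin(C*sin(p))/(C*p)

C = 0.8
ratios = []
for p in 0.2, 0.1, 0.05:
    error = r(C, p) - 1
    leading = p**2*(C**2 - 1)/6        # leading-order error term
    ratios.append(error/leading)
    print('p=%.2f  r-1=%.3e  leading term=%.3e  ratio=%.4f' %
          (p, error, leading, error/leading))
```

Halving $p$ reduces the error by a factor of about four, which is the second-order behavior expected from the truncation errors.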

We can do more with a series expansion, e.g., factor it to see how the factor $C - 1$
plays a significant role. To this end, we make a list of the terms, factor each term,
and then sum the terms:

>>> rs = r(C, p).series(p, 0, 7).removeO().as_ordered_terms()
>>> rs
[1, C**2*p**2/6 - p**2/6,
 3*C**4*p**4/40 - C**2*p**4/12 + p**4/120,
 5*C**6*p**6/112 - C**4*p**6/16 + 13*C**2*p**6/720 - p**6/5040]
>>> rs = [sym.factor(t) for t in rs]
>>> rs
[1, p**2*(C - 1)*(C + 1)/6,
 p**4*(C - 1)*(C + 1)*(3*C - 1)*(3*C + 1)/120,
 p**6*(C - 1)*(C + 1)*(225*C**4 - 90*C**2 + 1)/5040]
>>> rs = sum(rs)  # Python's sum function sums the list
>>> rs
p**6*(C - 1)*(C + 1)*(225*C**4 - 90*C**2 + 1)/5040 +
p**4*(C - 1)*(C + 1)*(3*C - 1)*(3*C + 1)/120 +
p**2*(C - 1)*(C + 1)/6 + 1

We see from the last expression that $C = 1$ makes all the terms in rs vanish, except
the constant 1. Since we already know that the numerical solution is exact for
$C = 1$, the remaining terms in the Taylor series expansion will also contain factors
of $C - 1$ and cancel for $C = 1$.

2.10.5 Extending the Analysis to 2D and 3D 

The typical analytical solution of a 2D wave equation

$$u_{tt} = c^2(u_{xx} + u_{yy}),$$

is a wave traveling in the direction of $\boldsymbol{k} = k_x\boldsymbol{i} + k_y\boldsymbol{j}$, where $\boldsymbol{i}$ and $\boldsymbol{j}$ are unit vectors
in the $x$ and $y$ directions, respectively ($\boldsymbol{i}$ should not be confused with $i = \sqrt{-1}$
here). Such a wave can be expressed by

$$u(x, y, t) = g(k_xx + k_yy - kct)$$

for some twice differentiable function $g$, or with $\omega = kc$, $k = |\boldsymbol{k}|$:

$$u(x, y, t) = g(k_xx + k_yy - \omega t).$$

We can, in particular, build a solution by adding complex Fourier components of
the form

$$\exp\left(i(k_xx + k_yy - \omega t)\right).$$
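A discrete 2D wave equation can be written as

$$[D_tD_tu = c^2(D_xD_xu + D_yD_yu)]^n_{q,r}. \tag{2.93}$$

This equation admits a Fourier component

$$u^n_{q,r} = \exp\left(i(k_xq\Delta x + k_yr\Delta y - \tilde\omega n\Delta t)\right), \tag{2.94}$$

as solution. Letting the operators $D_tD_t$, $D_xD_x$, and $D_yD_y$ act on $u^n_{q,r}$ from (2.94)
transforms (2.93) to

$$\frac{4}{\Delta t^2}\sin^2\left(\frac{\tilde\omega\Delta t}{2}\right) = c^2\frac{4}{\Delta x^2}\sin^2\left(\frac{k_x\Delta x}{2}\right) + c^2\frac{4}{\Delta y^2}\sin^2\left(\frac{k_y\Delta y}{2}\right) \tag{2.95}$$

or

$$\sin^2\left(\frac{\tilde\omega\Delta t}{2}\right) = C_x^2\sin^2p_x + C_y^2\sin^2p_y, \tag{2.96}$$

where we have eliminated the factor 4 and introduced the symbols

$$C_x = \frac{c\Delta t}{\Delta x},\quad C_y = \frac{c\Delta t}{\Delta y},\quad p_x = \frac{k_x\Delta x}{2},\quad p_y = \frac{k_y\Delta y}{2}.$$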


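For a real-valued $\tilde\omega$ the right-hand side must be less than or equal to unity in absolute
value, requiring in general that

$$C_x^2 + C_y^2 \le 1. \tag{2.97}$$

This gives the stability criterion, more commonly expressed directly in an inequality
for the time step:

$$\Delta t \le \frac{1}{c}\left(\frac{1}{\Delta x^2} + \frac{1}{\Delta y^2}\right)^{-1/2}. \tag{2.98}$$

A similar, straightforward analysis for the 3D case leads to

$$\Delta t \le \frac{1}{c}\left(\frac{1}{\Delta x^2} + \frac{1}{\Delta y^2} + \frac{1}{\Delta z^2}\right)^{-1/2}. \tag{2.99}$$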


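In the case of a variable coefficient $c^2 = c^2(\boldsymbol{x})$, we must use the worst-case value

$$\bar c = \sqrt{\max_{\boldsymbol{x}\in\Omega}c^2(\boldsymbol{x})} \tag{2.100}$$

in the stability criteria. Often, especially in the variable wave velocity case, it is
wise to introduce a safety factor $\beta\in(0,1]$ too:

$$\Delta t \le \beta\frac{1}{\bar c}\left(\frac{1}{\Delta x^2} + \frac{1}{\Delta y^2} + \frac{1}{\Delta z^2}\right)^{-1/2}. \tag{2.101}$$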
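The stability criteria translate directly into a small helper for choosing the time step. The sketch below is an illustration under stated assumptions: the function name stable_dt and its argument list are hypothetical and do not appear in the book's programs.

```python
from math import sqrt

def stable_dt(c_max, spacings, beta=1.0):
    """
    Largest time step allowed by (2.98)/(2.99), scaled by the safety
    factor beta as in (2.101).
    c_max:    largest wave velocity in the domain,
    spacings: sequence of mesh spacings, e.g. (dx, dy) or (dx, dy, dz).
    """
    if not 0 < beta <= 1:
        raise ValueError('beta must lie in (0, 1]')
    return beta/c_max/sqrt(sum(1.0/h**2 for h in spacings))

h = 0.1
print(stable_dt(1.0, (h, h)))                 # 2D: h/(c*sqrt(2))
print(stable_dt(1.0, (h, h, h)))              # 3D: h/(c*sqrt(3))
print(stable_dt(2.0, (0.1, 0.2), beta=0.9))   # unequal spacings, safety factor
```

With equal spacings $h$ the limits reduce to $h/(c\sqrt{2})$ in 2D and $h/(c\sqrt{3})$ in 3D, so the admissible time step shrinks as the number of space dimensions grows.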
The exact numerical dispersion relations in 2D and 3D become, for constant $c$,

$$\tilde\omega = \frac{2}{\Delta t}\sin^{-1}\left(\left(C_x^2\sin^2p_x + C_y^2\sin^2p_y\right)^{1/2}\right), \tag{2.102}$$

$$\tilde\omega = \frac{2}{\Delta t}\sin^{-1}\left(\left(C_x^2\sin^2p_x + C_y^2\sin^2p_y + C_z^2\sin^2p_z\right)^{1/2}\right). \tag{2.103}$$


We can visualize the numerical dispersion error in 2D much like we did in 1D.
To this end, we need to reduce the number of parameters in $\tilde\omega$. The direction of the
wave is parameterized by the polar angle $\theta$, which means that

$$k_x = k\cos\theta,\quad k_y = k\sin\theta.$$

A simplification is to set $\Delta x = \Delta y = h$. Then $C_x = C_y = c\Delta t/h$, which we call
$C$. Also,

$$p_x = \frac{1}{2}kh\cos\theta,\quad p_y = \frac{1}{2}kh\sin\theta.$$

The numerical frequency $\tilde\omega$ is now a function of three parameters:

• $C$, reflecting the number of cells a wave is displaced during a time step,
• $p = \frac{1}{2}kh$, reflecting the number of cells per wave length in space,
• $\theta$, expressing the direction of the wave.


We want to visualize the error in the numerical frequency. To avoid having $\Delta t$ as a
free parameter in $\tilde\omega$, we work with $\tilde c/c = \tilde\omega/(kc)$. The coefficient in front of the
$\sin^{-1}$ factor is then

$$\frac{2}{kc\Delta t} = \frac{2}{kc\Delta t\,h/h} = \frac{2}{Ckh} = \frac{1}{Cp},$$

and

$$\frac{\tilde c}{c} = \frac{1}{Cp}\sin^{-1}\left(C\left(\sin^2(p\cos\theta) + \sin^2(p\sin\theta)\right)^{1/2}\right).$$

We want to visualize this quantity as a function of $p$ and $\theta$ for some values of
$C < 1$. It is instructive to make color contour plots of $1 - \tilde c/c$ in polar coordinates
with $\theta$ as the angular coordinate and $p$ as the radial coordinate.

The stability criterion (2.97) becomes $C \le C_{\max} = 1/\sqrt{2}$ in the present 2D
case with the $C$ defined above. Let us plot $1 - \tilde c/c$ in polar coordinates for
$C_{\max}$, $0.9C_{\max}$, $0.5C_{\max}$, $0.2C_{\max}$. The program below does the somewhat tricky
work in Matplotlib, and the result appears in Fig. 2.7. From the figure we clearly
see that the maximum $C$ value gives the best results, and that waves whose
propagation direction makes an angle of 45 degrees with an axis are the most accurate.

import numpy as np
from numpy import \
     cos, sin, arcsin, sqrt, pi  # for nicer math formulas
import matplotlib.pyplot as plt

def dispersion_relation_2D(p, theta, C):
    arg = C*sqrt(sin(p*cos(theta))**2 +
                 sin(p*sin(theta))**2)
    c_frac = 1./(C*p)*arcsin(arg)   # coefficient 1/(C*p) since p = k*h/2
    return c_frac

r = p = np.linspace(0.001, pi/2, 101)
theta = np.linspace(0, 2*pi, 51)
r, theta = np.meshgrid(r, theta)

# Make 2x2 filled contour plots for 4 values of C
C_max = 1/sqrt(2)
C = [[C_max, 0.9*C_max], [0.5*C_max, 0.2*C_max]]
fig, axes = plt.subplots(2, 2, subplot_kw=dict(polar=True))
for row in range(2):
    for column in range(2):
        error = 1 - dispersion_relation_2D(
            p, theta, C[row][column])
        print(error.min(), error.max())
        # Common color scale: the error lies between 0 and about 0.36
        cax = axes[row][column].contourf(
            theta, r, error, 50, vmin=0, vmax=0.36)
        axes[row][column].set_xticks([])
        axes[row][column].set_yticks([])

# Add colorbar to the last plot
cbar = plt.colorbar(cax)
cbar.ax.set_ylabel('error in wave velocity')
plt.savefig('disprel2D.png'); plt.savefig('disprel2D.pdf')
plt.show()


Fig. 2.7 Error in numerical dispersion in 2D 
2.11 Finite Difference Methods for 2D and 3D Wave Equations

A natural next step is to consider extensions of the methods for various variants of 
the one-dimensional wave equation to two-dimensional (2D) and three-dimensional 
(3D) versions of the wave equation. 


2.11.1 Multi-Dimensional Wave Equations 


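The general wave equation in $d$ space dimensions, with constant wave velocity $c$,
can be written in the compact form

$$\frac{\partial^2u}{\partial t^2} = c^2\nabla^2u\quad\text{for } \boldsymbol{x}\in\Omega\subset\mathbb{R}^d,\ t\in(0,T], \tag{2.104}$$

where

$$\nabla^2u = \frac{\partial^2u}{\partial x^2} + \frac{\partial^2u}{\partial y^2},$$

in two space dimensions ($d = 2$), and

$$\nabla^2u = \frac{\partial^2u}{\partial x^2} + \frac{\partial^2u}{\partial y^2} + \frac{\partial^2u}{\partial z^2},$$

in three space dimensions ($d = 3$).

Many applications involve variable coefficients, and the general wave equation
in $d$ dimensions is in this case written as

$$\varrho\frac{\partial^2u}{\partial t^2} = \nabla\cdot(q\nabla u) + f\quad\text{for } \boldsymbol{x}\in\Omega\subset\mathbb{R}^d,\ t\in(0,T], \tag{2.105}$$

which in, e.g., 2D becomes

$$\varrho(x,y)\frac{\partial^2u}{\partial t^2} = \frac{\partial}{\partial x}\left(q(x,y)\frac{\partial u}{\partial x}\right) + \frac{\partial}{\partial y}\left(q(x,y)\frac{\partial u}{\partial y}\right) + f(x,y,t). \tag{2.106}$$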


To save some writing and space we may use the index notation, where subscript $t$,
$x$, or $y$ means differentiation with respect to that coordinate. For example,

$$\frac{\partial^2u}{\partial t^2} = u_{tt},\qquad \frac{\partial}{\partial y}\left(q(x,y)\frac{\partial u}{\partial y}\right) = (qu_y)_y.$$

These comments extend straightforwardly to 3D, which means that the 3D versions
of the two wave PDEs, with and without variable coefficients, can be stated as

$$u_{tt} = c^2(u_{xx} + u_{yy} + u_{zz}) + f, \tag{2.107}$$

$$\varrho u_{tt} = (qu_x)_x + (qu_y)_y + (qu_z)_z + f. \tag{2.108}$$
At each point of the boundary $\partial\Omega$ (of $\Omega$) we need one boundary condition
involving the unknown $u$. The boundary conditions are of three principal types:

1. $u$ is prescribed ($u = 0$ or a known time variation of $u$ at the boundary points,
   e.g., modeling an incoming wave),
2. $\partial u/\partial n = \boldsymbol{n}\cdot\nabla u$ is prescribed (zero for reflecting boundaries),
3. an open boundary condition (also called radiation condition) is specified to let
   waves travel undisturbed out of the domain, see Exercise 2.12 for details.

All the listed wave equations with second-order derivatives in time need two initial
conditions:

1. $u = I$,
2. $u_t = V$.


2.11.2 Mesh 
We introduce a mesh in time and in space. The mesh in time consists of time points

$$t_0 = 0 < t_1 < \cdots < t_{N_t},$$

normally, for wave equation problems, with a constant spacing $\Delta t = t_{n+1} - t_n$,
$n\in\mathcal{I}_t^-$.

Finite difference methods are easy to implement on simple rectangle- or
box-shaped spatial domains. More complicated shapes of the spatial domain require
substantially more advanced techniques and implementational efforts (and a finite
element method is usually a more convenient approach). On a rectangle- or
box-shaped domain, mesh points are introduced separately in the various space
directions:

$$x_0 < x_1 < \cdots < x_{N_x}\quad\text{in the } x \text{ direction},$$
$$y_0 < y_1 < \cdots < y_{N_y}\quad\text{in the } y \text{ direction},$$
$$z_0 < z_1 < \cdots < z_{N_z}\quad\text{in the } z \text{ direction}.$$

We can write a general mesh point as $(x_i, y_j, z_k, t_n)$, with $i\in\mathcal{I}_x$, $j\in\mathcal{I}_y$, $k\in\mathcal{I}_z$,
and $n\in\mathcal{I}_t$.

It is a very common choice to use constant mesh spacings: $\Delta x = x_{i+1} - x_i$,
$i\in\mathcal{I}_x^-$, $\Delta y = y_{j+1} - y_j$, $j\in\mathcal{I}_y^-$, and $\Delta z = z_{k+1} - z_k$, $k\in\mathcal{I}_z^-$. With equal mesh
spacings one often introduces $h = \Delta x = \Delta y = \Delta z$.

The unknown $u$ at mesh point $(x_i, y_j, z_k, t_n)$ is denoted by $u^n_{i,j,k}$. In 2D problems
we just skip the $z$ coordinate (by assuming no variation in that direction: $\partial/\partial z = 0$)
and write $u^n_{i,j}$.
2.11.3 Discretization

Two- and three-dimensional wave equations are easily discretized by assembling
building blocks for discretization of 1D wave equations, because the
multi-dimensional versions just contain terms of the same type as those in 1D.

Discretizing the PDEs Equation (2.107) can be discretized as

$$[D_tD_tu = c^2(D_xD_xu + D_yD_yu + D_zD_zu) + f]^n_{i,j,k}. \tag{2.109}$$

A 2D version might be instructive to write out in detail:

$$[D_tD_tu = c^2(D_xD_xu + D_yD_yu) + f]^n_{i,j},$$

which becomes

$$\frac{u^{n+1}_{i,j} - 2u^n_{i,j} + u^{n-1}_{i,j}}{\Delta t^2} = c^2\frac{u^n_{i+1,j} - 2u^n_{i,j} + u^n_{i-1,j}}{\Delta x^2} + c^2\frac{u^n_{i,j+1} - 2u^n_{i,j} + u^n_{i,j-1}}{\Delta y^2} + f^n_{i,j}.$$

Assuming, as usual, that all values at time levels $n$ and $n-1$ are known, we can
solve for the only unknown $u^{n+1}_{i,j}$. The result can be compactly written as

$$u^{n+1}_{i,j} = 2u^n_{i,j} - u^{n-1}_{i,j} + c^2\Delta t^2[D_xD_xu + D_yD_yu]^n_{i,j} + \Delta t^2f^n_{i,j}. \tag{2.110}$$
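As in the 1D case, we need to develop a special formula for $u^1_{i,j}$ where we
combine the general scheme for $u^{n+1}_{i,j}$, when $n = 0$, with the discretization of the
initial condition:

$$[D_{2t}u = V]^0_{i,j}\quad\Rightarrow\quad u^{-1}_{i,j} = u^1_{i,j} - 2\Delta tV_{i,j}.$$

Inserting this $u^{-1}_{i,j}$ in the scheme for $n = 0$ and solving for $u^1_{i,j}$ gives, in compact form,

$$u^1_{i,j} = u^0_{i,j} + \Delta tV_{i,j} + \frac{1}{2}c^2\Delta t^2[D_xD_xu + D_yD_yu]^0_{i,j} + \frac{1}{2}\Delta t^2f^0_{i,j}. \tag{2.111}$$

The PDE (2.108) with variable coefficients is discretized term by term using the
corresponding elements from the 1D case:

$$[\varrho D_tD_tu = (D_x\overline{q}^xD_xu + D_y\overline{q}^yD_yu + D_z\overline{q}^zD_zu) + f]^n_{i,j,k}. \tag{2.112}$$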
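When written out and solved for the unknown $u^{n+1}_{i,j,k}$, one gets the scheme

$$\begin{aligned}
u^{n+1}_{i,j,k} = {} & -u^{n-1}_{i,j,k} + 2u^n_{i,j,k} \\
& + \frac{1}{\varrho_{i,j,k}}\frac{\Delta t^2}{\Delta x^2}\left(\frac{1}{2}(q_{i,j,k} + q_{i+1,j,k})(u^n_{i+1,j,k} - u^n_{i,j,k}) - \frac{1}{2}(q_{i-1,j,k} + q_{i,j,k})(u^n_{i,j,k} - u^n_{i-1,j,k})\right) \\
& + \frac{1}{\varrho_{i,j,k}}\frac{\Delta t^2}{\Delta y^2}\left(\frac{1}{2}(q_{i,j,k} + q_{i,j+1,k})(u^n_{i,j+1,k} - u^n_{i,j,k}) - \frac{1}{2}(q_{i,j-1,k} + q_{i,j,k})(u^n_{i,j,k} - u^n_{i,j-1,k})\right) \\
& + \frac{1}{\varrho_{i,j,k}}\frac{\Delta t^2}{\Delta z^2}\left(\frac{1}{2}(q_{i,j,k} + q_{i,j,k+1})(u^n_{i,j,k+1} - u^n_{i,j,k}) - \frac{1}{2}(q_{i,j,k-1} + q_{i,j,k})(u^n_{i,j,k} - u^n_{i,j,k-1})\right) \\
& + \frac{\Delta t^2}{\varrho_{i,j,k}}f^n_{i,j,k}.
\end{aligned}$$

Also here we need to develop a special formula for $u^1_{i,j,k}$ by combining the
scheme for $n = 0$ with the discrete initial condition, which is just a matter of
inserting $u^{-1}_{i,j,k} = u^1_{i,j,k} - 2\Delta tV_{i,j,k}$ in the scheme and solving for $u^1_{i,j,k}$.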
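To make the structure of the variable-coefficient scheme concrete, here is a minimal vectorized sketch of one time step in 2D (the $z$ terms are dropped). The function name advance_variable_q, its arguments, and the test data are assumptions made for this illustration; the arithmetic means of $q$ follow the $\overline{q}^x$ and $\overline{q}^y$ averages in (2.112).

```python
import numpy as np

def advance_variable_q(u_n, u_nm1, q, rho, f_a, dx, dy, dt):
    """
    One time step for rho*u_tt = (q*u_x)_x + (q*u_y)_y + f at the interior
    points. All arrays have shape (Nx+1, Ny+1); boundary values of the
    returned array are left as zero.
    """
    u = np.zeros_like(u_n)
    c = (slice(1, -1), slice(1, -1))           # interior points (i, j)
    # Arithmetic means of q at (i+1/2,j), (i-1/2,j), (i,j+1/2), (i,j-1/2)
    q_xp = 0.5*(q[c] + q[2:,1:-1])
    q_xm = 0.5*(q[:-2,1:-1] + q[c])
    q_yp = 0.5*(q[c] + q[1:-1,2:])
    q_ym = 0.5*(q[1:-1,:-2] + q[c])
    flux_x = q_xp*(u_n[2:,1:-1] - u_n[c]) - q_xm*(u_n[c] - u_n[:-2,1:-1])
    flux_y = q_yp*(u_n[1:-1,2:] - u_n[c]) - q_ym*(u_n[c] - u_n[1:-1,:-2])
    u[c] = 2*u_n[c] - u_nm1[c] + dt**2/rho[c]*(
        flux_x/dx**2 + flux_y/dy**2 + f_a[c])
    return u

# Usage on a tiny mesh with q = 1 + x and rho = 1 (illustrative values)
x = np.linspace(0, 1, 6); y = np.linspace(0, 1, 5)
xv = x[:,np.newaxis]; yv = y[np.newaxis,:]
q = 1 + xv + 0*yv
rho = np.ones_like(q)
u0 = np.sin(np.pi*xv)*np.sin(np.pi*yv)
u1 = advance_variable_q(u0, u0, q, rho, np.zeros_like(q),
                        x[1]-x[0], y[1]-y[0], dt=0.01)
print(u1.shape)
```

For constant $q = c^2$ and $\varrho = 1$ the update reduces to the constant-coefficient scheme, which is a convenient first test of such a function.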

Handling boundary conditions where u is known The schemes listed above are
valid for the internal points in the mesh. After updating these, we need to visit all
the mesh points at the boundaries and set the prescribed $u$ value.

Discretizing the Neumann condition The condition $\partial u/\partial n = 0$ was
implemented in 1D by discretizing it with a $D_{2x}u$ centered difference, followed by
eliminating the fictitious $u$ point outside the mesh by using the general scheme
at the boundary point. Alternatively, one can introduce ghost cells and update a
ghost value for use in the Neumann condition. Exactly the same ideas are reused in
multiple dimensions.

Consider the condition $\partial u/\partial n = 0$ at a boundary $y = 0$ of a rectangular domain
$[0, L_x]\times[0, L_y]$ in 2D. The normal direction is then in the $-y$ direction, so

$$\frac{\partial u}{\partial n} = -\frac{\partial u}{\partial y},$$

and we set

$$[-D_{2y}u = 0]^n_{i,0}\quad\Rightarrow\quad\frac{u^n_{i,1} - u^n_{i,-1}}{2\Delta y} = 0.$$

From this it follows that $u^n_{i,-1} = u^n_{i,1}$. The discretized PDE at the boundary point
$(i, 0)$ reads

$$\frac{u^{n+1}_{i,0} - 2u^n_{i,0} + u^{n-1}_{i,0}}{\Delta t^2} = c^2\frac{u^n_{i+1,0} - 2u^n_{i,0} + u^n_{i-1,0}}{\Delta x^2} + c^2\frac{u^n_{i,1} - 2u^n_{i,0} + u^n_{i,-1}}{\Delta y^2} + f^n_{i,0}.$$
We can then just insert $u^n_{i,1}$ for $u^n_{i,-1}$ in this equation and solve for the boundary
value $u^{n+1}_{i,0}$, just as was done in 1D.

From these calculations, we see a pattern: the general scheme applies at the
boundary $j = 0$ too if we just replace $j - 1$ by $j + 1$. Such a pattern is
particularly useful for implementations. The details follow from the explained 1D case in
Sect. 2.6.3.

The alternative approach to eliminating fictitious values outside the mesh is to
have $u^n_{i,-1}$ available as a ghost value. The mesh is extended with one extra line
(2D) or plane (3D) of ghost cells at a Neumann boundary. In the present example it
means that we need a line with ghost cells below the boundary $y = 0$. The ghost
values must be updated according to $u^{n+1}_{i,-1} = u^{n+1}_{i,1}$.
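The index pattern for the Neumann boundary can be sketched in a few lines. The function below is a hypothetical illustration (name, argument list, and the restriction to $f = 0$ are assumptions): it updates the points $(i, 0)$, $i = 1, \ldots, N_x - 1$, with the interior scheme where index $j - 1$ is replaced by $j + 1$.

```python
import numpy as np

def advance_neumann_y0(u, u_n, u_nm1, Cx2, Cy2):
    """
    Update u at the boundary y=0 (j=0) for du/dn = 0 and f = 0, using
    the mirror condition u_n[i,-1] = u_n[i,1] in the general scheme.
    """
    j = 0
    jm1 = j + 1                      # stands in for the fictitious j-1
    u_xx = u_n[:-2,j] - 2*u_n[1:-1,j] + u_n[2:,j]
    u_yy = u_n[1:-1,jm1] - 2*u_n[1:-1,j] + u_n[1:-1,j+1]
    u[1:-1,j] = 2*u_n[1:-1,j] - u_nm1[1:-1,j] + Cx2*u_xx + Cy2*u_yy
    return u

# Demonstration: a field that is constant in space and time stays constant
Nx, Ny = 4, 3
const = 3*np.ones((Nx+1, Ny+1))
u_new = advance_neumann_y0(np.zeros_like(const), const, const, 0.3, 0.4)
print(u_new[1:-1,0])
```

The corner points are not touched here; in a complete solver they need the same replacement in the $x$ direction as well if the neighboring sides also have Neumann conditions.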


2.12 Implementation 


We shall now describe in detail various Python implementations for solving a
standard 2D, linear wave equation with constant wave velocity and $u = 0$ on the
boundary. The wave equation is to be solved in the space-time domain $\Omega\times(0, T]$,
where $\Omega = (0, L_x)\times(0, L_y)$ is a rectangular spatial domain. More precisely, the
complete initial-boundary value problem is defined by

$$u_{tt} = c^2(u_{xx} + u_{yy}) + f(x, y, t),\quad (x, y)\in\Omega,\ t\in(0, T], \tag{2.113}$$
$$u(x, y, 0) = I(x, y),\quad (x, y)\in\Omega, \tag{2.114}$$
$$u_t(x, y, 0) = V(x, y),\quad (x, y)\in\Omega, \tag{2.115}$$
$$u = 0,\quad (x, y)\in\partial\Omega,\ t\in(0, T], \tag{2.116}$$

where $\partial\Omega$ is the boundary of $\Omega$, in this case the four sides of the rectangle $\Omega =
[0, L_x]\times[0, L_y]$: $x = 0$, $x = L_x$, $y = 0$, and $y = L_y$.

The PDE is discretized as

$$[D_tD_tu = c^2(D_xD_xu + D_yD_yu) + f]^n_{i,j},$$

which leads to an explicit updating formula to be implemented in a program:

$$u^{n+1}_{i,j} = -u^{n-1}_{i,j} + 2u^n_{i,j} + C_x^2(u^n_{i+1,j} - 2u^n_{i,j} + u^n_{i-1,j}) + C_y^2(u^n_{i,j+1} - 2u^n_{i,j} + u^n_{i,j-1}) + \Delta t^2f^n_{i,j}, \tag{2.117}$$

for all interior mesh points $i\in\mathcal{I}_x^i$ and $j\in\mathcal{I}_y^i$, for $n\in\mathcal{I}_t^+$. The constants $C_x$ and $C_y$
are defined as

$$C_x = c\frac{\Delta t}{\Delta x},\quad C_y = c\frac{\Delta t}{\Delta y}.$$

At the boundary, we simply set $u^{n+1}_{i,j} = 0$ for $i = 0$, $j = 0, \ldots, N_y$; $i = N_x$,
$j = 0, \ldots, N_y$; $j = 0$, $i = 0, \ldots, N_x$; and $j = N_y$, $i = 0, \ldots, N_x$. For the
first step, $n = 0$, (2.117) is combined with the discretization of the initial condition
$u_t = V$, $[D_{2t}u = V]^0_{i,j}$, to obtain a special formula for $u^1_{i,j}$ at the interior mesh
points:

$$u^1_{i,j} = u^0_{i,j} + \Delta tV_{i,j} + \frac{1}{2}C_x^2(u^0_{i+1,j} - 2u^0_{i,j} + u^0_{i-1,j}) + \frac{1}{2}C_y^2(u^0_{i,j+1} - 2u^0_{i,j} + u^0_{i,j-1}) + \frac{1}{2}\Delta t^2f^0_{i,j}. \tag{2.118}$$
The algorithm is very similar to the one in 1D:

1. Set initial condition $u^0_{i,j} = I(x_i, y_j)$
2. Compute $u^1_{i,j}$ from (2.118)
3. Set $u^1_{i,j} = 0$ for the boundaries $i = 0, N_x$, $j = 0, N_y$
4. For $n = 1, 2, \ldots, N_t$:
   (a) Find $u^{n+1}_{i,j}$ from (2.117) for all internal mesh points, $i\in\mathcal{I}_x^i$, $j\in\mathcal{I}_y^i$
   (b) Set $u^{n+1}_{i,j} = 0$ for the boundaries $i = 0, N_x$, $j = 0, N_y$
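The four steps of the algorithm can be sketched compactly with array slices. This is a simplified illustration, not the solver from wave2D_u0.py: the function name solve_2d_wave is hypothetical, and we assume $V = 0$, $f = 0$, and an initial condition that vanishes on the boundary. The time loop stops at $n = N_t - 1$ so that the last computed level is $u^{N_t}$.

```python
import numpy as np

def solve_2d_wave(I, c, Lx, Ly, Nx, Ny, dt, T):
    """Solve u_tt = c^2(u_xx + u_yy) with u=0 on the boundary, V=0, f=0."""
    x = np.linspace(0, Lx, Nx+1); y = np.linspace(0, Ly, Ny+1)
    xv = x[:,np.newaxis]; yv = y[np.newaxis,:]
    Cx2 = (c*dt/(x[1]-x[0]))**2; Cy2 = (c*dt/(y[1]-y[0]))**2
    Nt = int(round(T/dt))

    def space_terms(v):
        v_xx = v[:-2,1:-1] - 2*v[1:-1,1:-1] + v[2:,1:-1]
        v_yy = v[1:-1,:-2] - 2*v[1:-1,1:-1] + v[1:-1,2:]
        return Cx2*v_xx + Cy2*v_yy

    def set_boundary(v):
        v[0,:] = 0; v[-1,:] = 0; v[:,0] = 0; v[:,-1] = 0

    u = np.zeros((Nx+1, Ny+1))
    u_n = np.zeros((Nx+1, Ny+1))
    u_nm1 = np.zeros((Nx+1, Ny+1))
    u_n[:,:] = I(xv, yv)                                   # step 1
    u[1:-1,1:-1] = u_n[1:-1,1:-1] + 0.5*space_terms(u_n)   # step 2, (2.118)
    set_boundary(u)                                        # step 3
    u_nm1, u_n, u = u_n, u, u_nm1
    for n in range(1, Nt):                                 # step 4
        u[1:-1,1:-1] = 2*u_n[1:-1,1:-1] - u_nm1[1:-1,1:-1] + \
                       space_terms(u_n)                    # (2.117)
        set_boundary(u)
        u_nm1, u_n, u = u_n, u, u_nm1
    return u_n, xv, yv, Nt*dt        # u_n holds the newest time level

# Standing wave test case with known exact solution
Lx = Ly = 1.0; c = 1.0
I = lambda x, y: np.sin(np.pi*x/Lx)*np.sin(np.pi*y/Ly)
u, xv, yv, t_end = solve_2d_wave(I, c, Lx, Ly, Nx=20, Ny=20, dt=0.02, T=1.0)
omega = c*np.pi*np.sqrt(1/Lx**2 + 1/Ly**2)
max_error = np.abs(u - I(xv, yv)*np.cos(omega*t_end)).max()
print(max_error)
```

The standing wave $u = \sin(\pi x/L_x)\sin(\pi y/L_y)\cos(\omega t)$ is used only as a convenient check; the remaining error stems from the numerical dispersion relation discussed in Sect. 2.10.5.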


2.12.1 Scalar Computations 


The solver function for a 2D case with constant wave velocity and boundary
condition $u = 0$ is analogous to the 1D case with similar parameter values (see
wave1D_u0.py), apart from a few necessary extensions. The code is found in the
program wave2D_u0.py.

Domain and mesh The spatial domain is now $[0, L_x]\times[0, L_y]$, specified by the
arguments Lx and Ly. Similarly, the number of mesh points in the $x$ and $y$
directions, $N_x$ and $N_y$, become the arguments Nx and Ny. In multi-dimensional problems
it makes less sense to specify a Courant number since the wave velocity is a vector
and mesh spacings may differ in the various spatial directions. We therefore give
$\Delta t$ explicitly. The signature of the solver function is then

def solver(I, V, f, c, Lx, Ly, Nx, Ny, dt, T,
           user_action=None, version='scalar'):


Key parameters used in the calculations are created as 


x = linspace(0, Lx, Nx+1)  # mesh points in x dir
y = linspace(0, Ly, Ny+1)  # mesh points in y dir
dx = x[1] - x[0]
dy = y[1] - y[0]
Nt = int(round(T/float(dt)))
t = linspace(0, Nt*dt, Nt+1)  # mesh points in time
Cx2 = (c*dt/dx)**2;  Cy2 = (c*dt/dy)**2  # help variables
dt2 = dt**2
Solution arrays We store $u^{n+1}_{i,j}$, $u^n_{i,j}$, and $u^{n-1}_{i,j}$ in three two-dimensional arrays,

u = zeros((Nx+1,Ny+1))      # solution array
u_n = zeros((Nx+1,Ny+1))    # solution at t-dt
u_nm1 = zeros((Nx+1,Ny+1))  # solution at t-2*dt

where $u^{n+1}_{i,j}$ corresponds to u[i,j], $u^n_{i,j}$ to u_n[i,j], and $u^{n-1}_{i,j}$ to u_nm1[i,j].


Index sets It is also convenient to introduce the index sets (cf. Sect. 2.6.4) 


Ix = range(0, u.shape[0]) 
Iy = range(0, u.shape[1]) 
It = range(0, t.shape[0]) 


Computing the solution Inserting the initial condition I in u_n and making a
callback to the user in terms of the user_action function is a straightforward
generalization of the 1D code from Sect. 2.1.6:

for i in Ix:
    for j in Iy:
        u_n[i,j] = I(x[i], y[j])

if user_action is not None:
    user_action(u_n, x, xv, y, yv, t, 0)


The user_action function has additional arguments compared to the 1D case. The 
arguments xv and yv will be commented upon in Sect. 2.12.2. 

The key finite difference formula (2.110) for updating the solution at a time level 
is implemented in a separate function as 


def advance_scalar(u, u_n, u_nm1, f, x, y, t, n, Cx2, Cy2, dt2,
                   V=None, step1=False):
    Ix = range(0, u.shape[0]);  Iy = range(0, u.shape[1])
    if step1:
        dt = sqrt(dt2)  # save
        Cx2 = 0.5*Cx2;  Cy2 = 0.5*Cy2; dt2 = 0.5*dt2  # redefine
        D1 = 1;  D2 = 0
    else:
        D1 = 2;  D2 = 1
    for i in Ix[1:-1]:
        for j in Iy[1:-1]:
            u_xx = u_n[i-1,j] - 2*u_n[i,j] + u_n[i+1,j]
            u_yy = u_n[i,j-1] - 2*u_n[i,j] + u_n[i,j+1]
            u[i,j] = D1*u_n[i,j] - D2*u_nm1[i,j] + \
                     Cx2*u_xx + Cy2*u_yy + dt2*f(x[i], y[j], t[n])
            if step1:
                u[i,j] += dt*V(x[i], y[j])
    # Boundary condition u=0
    j = Iy[0]
    for i in Ix: u[i,j] = 0
    j = Iy[-1]
    for i in Ix: u[i,j] = 0
    i = Ix[0]
    for j in Iy: u[i,j] = 0
    i = Ix[-1]
    for j in Iy: u[i,j] = 0
    return u

The step1 variable has been introduced to allow the formula to be reused for the
first step, computing $u^1_{i,j}$:

u = advance_scalar(u, u_n, u_nm1, f, x, y, t,
                   n, Cx2, Cy2, dt2, V, step1=True)


Below, we will make many alternative implementations of the advance_scalar 
function to speed up the code since most of the CPU time in simulations is spent in 
this function. 


Remark: How to use the solution 

The solver function in the wave2D_u0.py code updates arrays for the next
time step by switching references as described in Sect. 2.4.5. Any use of u on
the user's side is assumed to take place in the user action function. However,
should the code be changed such that u is returned and used as solution, keep in
mind that you must return u_n after the time loop, otherwise a return u will
actually return u_nm1 (due to the switching of array references in the loop)!
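The effect of the reference switching can be demonstrated with three tiny arrays. This is a stand-alone sketch, not an excerpt from wave2D_u0.py: the assignment u[:] = n is a stand-in for computing the solution at a new time level.

```python
import numpy as np

u = np.zeros(3); u_n = np.zeros(3); u_nm1 = np.zeros(3)
for n in range(1, 4):
    u[:] = n                          # stand-in for computing a new level
    # Switch references before the next step (no data is copied)
    u_nm1, u_n, u = u_n, u, u_nm1

print(u_n)     # newest level, filled in the last pass (value 3)
print(u_nm1)   # second newest level (value 2)
print(u)       # oldest data, to be overwritten in the next step (value 1)
```

After the loop, the name u refers to the array with the oldest data, which is why u_n is the array to return.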


2.12.2 Vectorized Computations 


The scalar code above turns out to be extremely slow for large 2D meshes, and prob- 
ably useless in 3D beyond debugging of small test cases. Vectorization is therefore 
a must for multi-dimensional finite difference computations in Python. For exam- 
ple, with a mesh consisting of 30 x 30 cells, vectorization brings down the CPU 
time by a factor of 70 (!). Equally important, vectorized code can also easily be 
parallelized to take (usually) optimal advantage of parallel computer platforms. 

In the vectorized case, we must be able to evaluate user-given functions like 
I(x, y) and f(x, y,t) for the entire mesh in one operation (without loops). These 
user-given functions are provided as Python functions I(x,y) and f(x,y,t), re- 
spectively. Having the one-dimensional coordinate arrays x and y is not sufficient 
when calling I and f in a vectorized way. We must extend x and y to their vectorized 
versions xv and yv: 


from numpy import newaxis
xv = x[:,newaxis]
yv = y[newaxis,:]
# or
xv = x.reshape((x.size, 1))
yv = y.reshape((1, y.size))

This is a standard required technique when evaluating functions over a 2D mesh, say
sin(xv)*cos(yv), which then gives a result with shape (Nx+1,Ny+1). Calling
I(xv, yv) and f(xv, yv, t[n]) will now return I and f values for the entire
set of mesh points.
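A small experiment shows the shapes involved. The mesh sizes and the function I below are arbitrary choices made for this illustration.

```python
import numpy as np

Nx, Ny = 4, 3
x = np.linspace(0, 1, Nx+1)
y = np.linspace(0, 2, Ny+1)
xv = x[:,np.newaxis]          # shape (Nx+1, 1)
yv = y[np.newaxis,:]          # shape (1, Ny+1)

def I(x, y):
    return np.sin(x)*np.cos(y)

values = I(xv, yv)            # broadcasting gives shape (Nx+1, Ny+1)
print(xv.shape, yv.shape, values.shape)   # (5, 1) (1, 4) (5, 4)
```

The entry values[i,j] equals I(x[i], y[j]), so the first index runs over the $x$ direction, in agreement with the u[i,j] convention used in the solver.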

With the xv and yv arrays for vectorized computing, setting the initial condition 
is just a matter of 


u_n[:,:] = I(xv, yv)

One could also have written u_n = I(xv, yv) and let u_n point to a new object,
but vectorized operations often make use of direct insertion in the original array
through u_n[:,:], because sometimes not all of the array is to be filled by such a
function evaluation. This is the case with the computational scheme for $u^{n+1}_{i,j}$:

def advance_vectorized(u, u_n, u_nm1, f_a, Cx2, Cy2, dt2,
                       V=None, step1=False):
    if step1:
        dt = np.sqrt(dt2)  # save
        Cx2 = 0.5*Cx2;  Cy2 = 0.5*Cy2; dt2 = 0.5*dt2  # redefine
        D1 = 1;  D2 = 0
    else:
        D1 = 2;  D2 = 1
    u_xx = u_n[:-2,1:-1] - 2*u_n[1:-1,1:-1] + u_n[2:,1:-1]
    u_yy = u_n[1:-1,:-2] - 2*u_n[1:-1,1:-1] + u_n[1:-1,2:]
    u[1:-1,1:-1] = D1*u_n[1:-1,1:-1] - D2*u_nm1[1:-1,1:-1] + \
                   Cx2*u_xx + Cy2*u_yy + dt2*f_a[1:-1,1:-1]
    if step1:
        u[1:-1,1:-1] += dt*V[1:-1, 1:-1]
    # Boundary condition u=0
    j = 0
    u[:,j] = 0
    j = u.shape[1]-1
    u[:,j] = 0
    i = 0
    u[i,:] = 0
    i = u.shape[0]-1
    u[i,:] = 0
    return u


Array slices in 2D are more complicated to understand than those in 1D, but
the logic from 1D applies to each dimension separately. For example, when doing
$u^n_{i,j} - u^n_{i-1,j}$ for $i\in\mathcal{I}_x^+$, we just keep $j$ constant and make a slice in the first index:
u_n[1:,j] - u_n[:-1,j], exactly as in 1D. The 1: slice specifies all the indices
$i = 1, 2, \ldots, N_x$ (up to the last valid index), while :-1 specifies the relevant indices
for the second term: $0, 1, \ldots, N_x - 1$ (up to, but not including the last index).

In the above code segment, the situation is slightly more complicated, because
each displaced slice in one direction is accompanied by a 1:-1 slice in the other
direction. The reason is that we only work with the internal points for the index that
is kept constant in a difference.
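When in doubt about a slice expression, it helps to compare it with the corresponding loops on a small array. The following sketch (array sizes and random data are arbitrary) does this for the second-order difference in the $x$ direction.

```python
import numpy as np

Nx, Ny = 5, 4
u_n = np.random.rand(Nx+1, Ny+1)

# Scalar version: difference in x at all interior points
u_xx_loop = np.zeros((Nx-1, Ny-1))
for i in range(1, Nx):
    for j in range(1, Ny):
        u_xx_loop[i-1,j-1] = u_n[i-1,j] - 2*u_n[i,j] + u_n[i+1,j]

# Vectorized version: displaced slices in i, the slice 1:-1 in j
u_xx_vec = u_n[:-2,1:-1] - 2*u_n[1:-1,1:-1] + u_n[2:,1:-1]
print(np.abs(u_xx_loop - u_xx_vec).max())
```

The same comparison with the roles of the two indices swapped verifies the expression for the difference in the $y$ direction.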

The boundary conditions along the four sides make use of a slice consisting of 
all indices along a boundary: 


u[: ,0] = 0
u[:,Ny] = 0
u[0 ,:] = 0
u[Nx,:] = 0

In the vectorized update of u (above), the function f is first computed as an array
over all mesh points:

f_a = f(xv, yv, t[n])


We could, alternatively, have used the call f (xv, yv, t[n])[1:-1,1:-1] in the 
last term of the update statement, but other implementations in compiled languages 
benefit from having f available in an array rather than calling our Python function 
f(x,y,t) for every point. 

Also in the advance_vectorized function we have introduced a boolean 
step1 to reuse the formula for the first time step in the same way as we did with 
advance_scalar. We refer to the solver function in wave2D_u0.py for the 
details on how the overall algorithm is implemented. 

The callback function now has the arguments u, x, xv, y, yv, t, n. 
The inclusion of xv and yv makes it easy to, e.g., compute an exact 2D so- 
lution in the callback function and compute errors, through an expression like 
u - u_exact(xv, yv, t[n]). 


2.12.3 Verification 


Testing a quadratic solution The 1D solution from Sect. 2.2.4 can be generalized
to multi-dimensions and provides a test case where the exact solution also fulfills
the discrete equations, such that we know (to machine precision) what numbers
the solver function should produce. In 2D we use the following generalization of
(2.30):

$$u_e(x, y, t) = x(L_x - x)y(L_y - y)\left(1 + \frac{1}{2}t\right). \tag{2.119}$$

This solution fulfills the PDE problem if $I(x, y) = u_e(x, y, 0)$, $V = \frac{1}{2}u_e(x, y, 0)$,
and $f = 2c^2(1 + \frac{1}{2}t)(y(L_y - y) + x(L_x - x))$. To show that $u_e$ also solves the
discrete equations, we start with the general results $[D_tD_t1]^n = 0$, $[D_tD_tt]^n = 0$,
and $[D_tD_tt^2]^n = 2$, and use these to compute

$$[D_xD_xu_e]^n_{i,j} = \left[y(L_y - y)\left(1 + \frac{1}{2}t\right)D_xD_xx(L_x - x)\right]^n_{i,j} = y_j(L_y - y_j)\left(1 + \frac{1}{2}t_n\right)(-2).$$

A similar calculation must be carried out for the $[D_yD_yu_e]^n_{i,j}$ and $[D_tD_tu_e]^n_{i,j}$
terms. One must also show that the quadratic solution fits the special formula for
$u^1_{i,j}$. The details are left as Exercise 2.16. The test_quadratic function in the
wave2D_u0.py program implements this verification as a proper test function for
the pytest and nose frameworks.
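That the quadratic solution fulfills the discrete equation at a time level can also be checked numerically before doing the algebra. The sketch below is not the test_quadratic function from wave2D_u0.py; the parameter values and variable names are assumptions made for illustration. It evaluates the residual of the discrete PDE at all interior points.

```python
import numpy as np

Lx, Ly, c = 2.0, 3.0, 1.5
Nx, Ny, dt = 5, 4, 0.1          # a coarse mesh suffices; the result is exact
x = np.linspace(0, Lx, Nx+1); y = np.linspace(0, Ly, Ny+1)
dx = x[1] - x[0]; dy = y[1] - y[0]
xv = x[:,np.newaxis]; yv = y[np.newaxis,:]

def u_e(x, y, t):
    return x*(Lx - x)*y*(Ly - y)*(1 + 0.5*t)

def f(x, y, t):
    return 2*c**2*(1 + 0.5*t)*(y*(Ly - y) + x*(Lx - x))

n = 3; t_n = n*dt
U_m, U, U_p = [u_e(xv, yv, t_n + s*dt) for s in (-1, 0, 1)]
i = slice(1, -1)                 # interior points
DtDt = (U_p[i,i] - 2*U[i,i] + U_m[i,i])/dt**2
DxDx = (U[2:,i] - 2*U[i,i] + U[:-2,i])/dx**2
DyDy = (U[i,2:] - 2*U[i,i] + U[i,:-2])/dy**2
residual = DtDt - c**2*(DxDx + DyDy) - f(xv, yv, t_n)[i,i]
print(np.abs(residual).max())    # rounding-error level
```

A residual at rounding-error level for any mesh resolution is the signature of a solution that is reproduced exactly by the scheme, which is what makes (2.119) suitable for a unit test.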


2.12.4 Visualization 


Eventually, we are ready for a real application with our code! Look at the 
wave2D_u0.py and the gaussian function. It starts with a Gaussian function 
to see how it propagates in a square with u = 0 on the boundaries: 


def gaussian(plot_method=2, version='vectorized', save_plot=True):
    """
    Initial Gaussian bell in the middle of the domain.
    plot_method=1 applies mesh function,
    =2 means surf, =3 means Matplotlib, =4 means mayavi,
    =0 means no plot.
    """
    # Clean up plot files
    for name in glob('tmp_*.png'):
        os.remove(name)

    Lx = 10
    Ly = 10
    c = 1.0

    from numpy import exp

    def I(x, y):
        """Gaussian peak at (Lx/2, Ly/2)."""
        return exp(-0.5*(x-Lx/2.0)**2 - 0.5*(y-Ly/2.0)**2)

    def plot_u(u, x, xv, y, yv, t, n):
        """User action function for plotting."""

    Nx = 40; Ny = 40; T = 20
    dt, cpu = solver(I, None, None, c, Lx, Ly, Nx, Ny, -1, T,
                     user_action=plot_u, version=version)


Matplotlib We want to animate a 3D surface in Matplotlib, but this is a really slow 
process and not recommended, so we consider Matplotlib not an option as long as 
on-screen animation is desired. One can use the recipes for single shots of u, where 
it does produce high-quality 3D plots. 


Gnuplot Let us look at different ways for visualization. We import SciTools as
st and can access st.mesh and st.surf in Matplotlib or Gnuplot, but this is not
supported except for the Gnuplot package, where it works really well (Fig. 2.8).
Then we choose plot_method=2 (or less relevant plot_method=1) and force the
backend for SciTools to be Gnuplot (if you have the C package Gnuplot and the
Gnuplot.py Python interface module installed):

Terminal

Terminal> python wave2D_u0.py --SCITOOLS_easyviz_backend gnuplot

It gives a nice visualization with lifted surface and contours beneath. Figure 2.8
shows four plots of u.

Fig. 2.8 Snapshots of the surface plotted by Gnuplot (panels at t=0, t=1.59099, t=3.18198, and t=19.4454)

Video files can be made of the PNG frames: 


Terminal 


Terminal> ffmpeg -i tmp_%04d.png -r 25 -vcodec flv movie.flv
Terminal> ffmpeg -i tmp_%04d.png -r 25 -vcodec libx264 movie.mp4
Terminal> ffmpeg -i tmp_%04d.png -r 25 -vcodec libvpx movie.webm
Terminal> ffmpeg -i tmp_%04d.png -r 25 -vcodec libtheora movie.ogg


It is wise to use a high frame rate — a low one will just skip many frames. There 
may also be considerable quality differences between the different formats. 


Movie 1 https://raw.githubusercontent.com/hplgit/fdm-book/master/doc/pub/book/html/mov-wave/gnuplot/wave2D_u0_gaussian/movie25.mp4
Mayavi The best option for doing visualization of 2D and 3D scalar and vector
fields in Python programs is Mayavi, which is an interface to the high-quality pack- 
age VTK in C++. There is good online documentation and also an introduction in 
Chapter 5 of [10]. 

To obtain Mayavi on Ubuntu platforms you can write 


Terminal 


pip install mayavi --upgrade 


For Mac OS X and Windows, we recommend using Anaconda. To obtain Mayavi 
for Anaconda you can write 


Terminal 


conda install mayavi 


Mayavi has a MATLAB-like interface called mlab. We can do 


import mayavi.mlab as plt 
# or 
from mayavi import mlab 


and have plt (as usual) or mlab as a kind of MATLAB visualization access inside 
our program (just more powerful and with higher visual quality). 

The official documentation of the mlab module is provided in two places, one for
the basic functionality¹² and one for further functionality¹³. Basic figure handling¹⁴
is very similar to the one we know from Matplotlib. Just as for Matplotlib, all
plotting commands you do in mlab will go into the same figure, until you manually
change to a new figure.

Back to our application, the following code for the user action function with 
plotting in Mayavi is relevant to add. 


# Top of the file
try:
    import mayavi.mlab as mlab
except ImportError:
    # We don't have mayavi
    pass

def solver(...):
    ...

def gaussian(...):
    ...
    if plot_method == 3:
        from mpl_toolkits.mplot3d import axes3d
        import matplotlib.pyplot as plt
        from matplotlib import cm
        plt.ion()
        fig = plt.figure()
        u_surf = None

    def plot_u(u, x, xv, y, yv, t, n):
        """User action function for plotting."""
        if t[n] == 0:
            time.sleep(2)
        if plot_method == 1:
            # Works well with Gnuplot backend, not with Matplotlib
            st.mesh(x, y, u, title='t=%g' % t[n], zlim=[-1,1],
                    caxis=[-1,1])
        elif plot_method == 2:
            # Works well with Gnuplot backend, not with Matplotlib
            st.surfc(xv, yv, u, title='t=%g' % t[n], zlim=[-1, 1],
                     colorbar=True, colormap=st.hot(), caxis=[-1,1],
                     shading='flat')
        elif plot_method == 3:
            print('Experimental 3D matplotlib...not recommended')
        elif plot_method == 4:
            # Mayavi visualization
            mlab.clf()
            extent1 = (0, 20, 0, 20,-2, 2)
            s = mlab.surf(x, y, u,
                          colormap='Blues',
                          warp_scale=5, extent=extent1)
            mlab.axes(s, color=(.7, .7, .7), extent=extent1,
                      ranges=(0, 10, 0, 10, -1, 1),
                      xlabel='', ylabel='', zlabel='',
                      x_axis_visibility=False,
                      z_axis_visibility=False)
            mlab.outline(s, color=(0.7, .7, .7), extent=extent1)
            mlab.text(6, -2.5, '', z=-4, width=0.14)
            mlab.colorbar(object=None, title=None,
                          orientation='horizontal',
                          nb_labels=None, nb_colors=None,
                          label_fmt=None)
            mlab.title('Gaussian t=%g' % t[n])
            mlab.view(142, -72, 50)
            f = mlab.gcf()
            camera = f.scene.camera
            camera.yaw(0)

        if plot_method > 0:
            time.sleep(0)  # pause between frames
            if save_plot:
                filename = 'tmp_%04d.png' % n
                if plot_method == 4:
                    mlab.savefig(filename)  # time consuming!
                elif plot_method in (1,2):
                    st.savefig(filename)  # time consuming!

12 http://docs.enthought.com/mayavi/mayavi/auto/mlab_helper_functions.html
13 http://docs.enthought.com/mayavi/mayavi/auto/mlab_other_functions.html
14 http://docs.enthought.com/mayavi/mayavi/auto/mlab_figure.html

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Fig.2.9 Plot with Mayavi 


This is only a starting point: visualization is, as always, a very time-consuming and
experimental discipline. With the PNG files we can use ffmpeg to create videos.


Movie 2 https://github.com/hplgit/fdm-book/blob/master/doc/pub/book/html/mov-wave/mayavi/wave2D_u0_gaussian/movie.mp4
Exercise 2.16: Check that a solution fulfills the discrete model

Carry out all mathematical details to show that (2.119) is indeed a solution of the 
discrete model for a 2D wave equation with u = 0 on the boundary. One must 
check the boundary conditions, the initial conditions, the general discrete equation 
at a time level and the special version of this equation for the first time level. 
Filename: check_quadratic_solution. 


Project 2.17: Calculus with 2D mesh functions 
The goal of this project is to redo Project 2.6 with 2D mesh functions ($f_{i,j}$).


Differentiation The differentiation results in a discrete gradient function, which
in the 2D case can be represented by a three-dimensional array df[d,i,j] where
d represents the direction of the derivative, and i,j is a mesh point in 2D. Use
centered differences for the derivative at inner points and one-sided forward or
backward differences at the boundary points. Construct unit tests and write a
corresponding test function.
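
As a starting point for the differentiation part, the array layout df[d,i,j] can be
sketched as follows. This is a minimal illustration, not the project solution: the
function name gradient_2D and the restriction to a uniform mesh with spacings dx and
dy are assumptions made here.

```python
import numpy as np

def gradient_2D(f, dx, dy):
    """Return df[d,i,j] for a mesh function f[i,j] on a uniform mesh.

    d=0 holds the x derivative and d=1 the y derivative.
    (Hypothetical helper; name and signature are illustrative.)
    """
    df = np.zeros((2,) + f.shape)
    # x direction: centered differences at inner points
    df[0, 1:-1, :] = (f[2:, :] - f[:-2, :])/(2*dx)
    # one-sided forward/backward differences at the boundary
    df[0, 0, :] = (f[1, :] - f[0, :])/dx
    df[0, -1, :] = (f[-1, :] - f[-2, :])/dx
    # y direction: same idea along the second index
    df[1, :, 1:-1] = (f[:, 2:] - f[:, :-2])/(2*dy)
    df[1, :, 0] = (f[:, 1] - f[:, 0])/dy
    df[1, :, -1] = (f[:, -1] - f[:, -2])/dy
    return df

def test_gradient_2D():
    """All the differences used are exact for a linear function."""
    x = np.linspace(0, 1, 6)
    y = np.linspace(0, 2, 9)
    xv = x[:, np.newaxis]
    yv = y[np.newaxis, :]
    f = 2*xv - 3*yv + 1
    df = gradient_2D(f, x[1] - x[0], y[1] - y[0])
    assert np.allclose(df[0], 2)
    assert np.allclose(df[1], -3)
```

Since both centered and one-sided differences reproduce the derivative of a linear
function exactly, a linear f gives a test with tolerance at the round-off level.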
Integration The integral of a 2D mesh function $f_{i,j}$ is defined as

$$ F_{i,j} = \int_{y_0}^{y_j}\int_{x_0}^{x_i} f(x,y)\,dx\,dy, $$

where $f(x,y)$ is a function that takes on the values of the discrete mesh function
$f_{i,j}$ at the mesh points, but can also be evaluated in between the mesh points. The
particular variation between mesh points can be taken as bilinear, but this is not
important as we will use a product Trapezoidal rule to approximate the integral over
a cell in the mesh and then we only need to evaluate $f(x,y)$ at the mesh points.

Suppose $F_{i,j}$ is computed. The calculation of $F_{i+1,j}$ is then

$$ F_{i+1,j} = F_{i,j} + \int_{x_i}^{x_{i+1}}\int_{y_0}^{y_j} f(x,y)\,dy\,dx
   \approx F_{i,j} + \frac{1}{2}\Delta x\left(\int_{y_0}^{y_j} f(x_i,y)\,dy
   + \int_{y_0}^{y_j} f(x_{i+1},y)\,dy\right). $$

The integrals in the y direction can be approximated by a Trapezoidal rule. A similar
idea can be used to compute $F_{i,j+1}$. Thereafter, $F_{i+1,j+1}$ can be computed by
adding the integral over the final corner cell to $F_{i+1,j} + F_{i,j+1} - F_{i,j}$.
Carry out the details of these computations and implement a function that can return
$F_{i,j}$ for all mesh indices i and j. Use the fact that the Trapezoidal rule is exact
for linear functions and write a test function.

Filename: mesh_calculus_2D. 
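
One way to organize the integration is sketched below, as an illustration rather than
a complete solution. It assumes a uniform mesh, applies the product Trapezoidal rule
to every cell, and accumulates the cell integrals with cumulative sums, which is
equivalent to the recursion for $F_{i+1,j+1}$ described above. The name integrate_2D
is hypothetical.

```python
import numpy as np

def integrate_2D(f, dx, dy):
    """Return F[i,j], the integral of f over [x_0,x_i]x[y_0,y_j].

    Product Trapezoidal rule on a uniform mesh (illustrative sketch).
    """
    F = np.zeros(f.shape, dtype=float)
    # Integral over each cell: average of the four corner values
    # times the cell area
    cell = 0.25*dx*dy*(f[:-1, :-1] + f[1:, :-1] +
                       f[:-1, 1:] + f[1:, 1:])
    # Summing all cells to the left of and below point (i,j)
    F[1:, 1:] = np.cumsum(np.cumsum(cell, axis=0), axis=1)
    return F

def test_integrate_2D():
    """The rule is exact for a linear integrand."""
    x = np.linspace(0, 1, 6)
    y = np.linspace(0, 2, 9)
    xv = x[:, np.newaxis]
    yv = y[np.newaxis, :]
    f = 2*xv + 3*yv + 1
    # Exact double integral of 2x + 3y + 1 from (0,0) to (x,y)
    F_exact = xv**2*yv + 1.5*xv*yv**2 + xv*yv
    F = integrate_2D(f, x[1] - x[0], y[1] - y[0])
    assert np.allclose(F, F_exact)
```

The first row and column of F are zero because the integration domain then has zero
area.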


Exercise 2.18: Implement Neumann conditions in 2D 
Modify the wave2D_u0.py program, which solves the 2D wave equation $u_{tt} =
c^2(u_{xx} + u_{yy})$ with constant wave velocity c and u = 0 on the boundary, to have
Neumann boundary conditions: $\partial u/\partial n = 0$. Include both scalar code (for
debugging and reference) and vectorized code (for speed).

To test the code, use u = 1.2 as solution ($I(x,y) = 1.2$, $V = f = 0$, and
c arbitrary), which should be exactly reproduced with any mesh as long as the
stability criterion is satisfied. Another test is to use the plug-shaped pulse in the
pulse function from Sect. 2.8 and the wave1D_dn_vc.py program. This pulse is
exactly propagated in 1D if $c\Delta t/\Delta x = 1$. Check that also the 2D program can
propagate this pulse exactly in x direction ($c\Delta t/\Delta x = 1$, $\Delta y$ arbitrary)
and y direction ($c\Delta t/\Delta y = 1$, $\Delta x$ arbitrary).
Filename: wave2D_dn. 


Exercise 2.19: Test the efficiency of compiled loops in 3D 

Extend the wave2D_u0.py code and the Cython, Fortran, and C versions to 3D. 
Set up an efficiency experiment to determine the relative efficiency of pure scalar 
Python code, vectorized code, Cython-compiled loops, Fortran-compiled loops, and 
C-compiled loops. Normalize the CPU time for each mesh by the fastest version. 
Filename: wave3D_u0. 
2.14 Applications of Wave Equations


This section presents a range of wave equation models for different physical phe- 
nomena. Although many wave motion problems in physics can be modeled by the 
standard linear wave equation, or a similar formulation with a system of first-order 
equations, there are some exceptions. Perhaps the most important is water waves: 
these are modeled by the Laplace equation with time-dependent boundary condi- 
tions at the water surface (long water waves, however, can be approximated by a 
standard wave equation, see Sect. 2.14.7). Quantum mechanical waves constitute 
another example where the waves are governed by the Schrödinger equation, i.e., 
not by a standard wave equation. Many wave phenomena also need to take nonlinear
effects into account when the wave amplitude is significant. Shock waves in the
air are a primary example.

The derivations in the following are very brief. Those with a firm background 
in continuum mechanics will probably have enough knowledge to fill in the details, 
while other readers will hopefully get some impression of the physics and approxi- 
mations involved when establishing wave equation models. 


2.14.1 Waves on a String


Figure 2.10 shows a model we may use to derive the equation for waves on a string.
The string is modeled as a set of discrete point masses (at mesh points) with elastic
strings in between. The string has a large constant tension T. We let the mass at
mesh point $x_i$ be $m_i$. The displacement of this mass point in the y direction is
denoted by $u_i(t)$.

Fig. 2.10 Discrete string model with point masses connected by elastic strings

The motion of mass $m_i$ is governed by Newton's second law of motion. The
position of the mass at time t is $x_i\boldsymbol{i} + u_i(t)\boldsymbol{j}$, where
$\boldsymbol{i}$ and $\boldsymbol{j}$ are unit vectors in the x and y direction,
respectively. The acceleration is then $u_i''(t)\boldsymbol{j}$. Two forces are
acting on the mass as indicated in Fig. 2.10. The force $\boldsymbol{T}^-$ acting toward
the point $x_{i-1}$ can be decomposed as

$$ \boldsymbol{T}^- = -T\sin\phi\,\boldsymbol{i} - T\cos\phi\,\boldsymbol{j}, $$

where $\phi$ is the angle between the force and the line $x = x_i$. Let
$\Delta u_i = u_i - u_{i-1}$ and let $\Delta s_i = \sqrt{\Delta u_i^2 + (x_i - x_{i-1})^2}$
be the distance from mass $m_{i-1}$ to mass $m_i$. It is seen that
$\cos\phi = \Delta u_i/\Delta s_i$ and $\sin\phi = (x_i - x_{i-1})/\Delta s_i$, or
$\Delta x/\Delta s_i$ if we introduce a constant mesh spacing $\Delta x = x_i - x_{i-1}$.
The force can then be written

$$ \boldsymbol{T}^- = -T\frac{\Delta x}{\Delta s_i}\boldsymbol{i}
   - T\frac{\Delta u_i}{\Delta s_i}\boldsymbol{j}. $$

The force $\boldsymbol{T}^+$ acting toward $x_{i+1}$ can be calculated in a similar way:

$$ \boldsymbol{T}^+ = T\frac{\Delta x}{\Delta s_{i+1}}\boldsymbol{i}
   + T\frac{\Delta u_{i+1}}{\Delta s_{i+1}}\boldsymbol{j}. $$

Newton's second law becomes

$$ m_i u_i''(t)\boldsymbol{j} = \boldsymbol{T}^+ + \boldsymbol{T}^-, $$

which gives the component equations

$$ T\frac{\Delta x}{\Delta s_i} = T\frac{\Delta x}{\Delta s_{i+1}}, \tag{2.120} $$

$$ m_i u_i''(t) = T\frac{\Delta u_{i+1}}{\Delta s_{i+1}} - T\frac{\Delta u_i}{\Delta s_i}. \tag{2.121} $$

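A basic reasonable assumption for a string is small displacements $u_i$ and small
displacement gradients $\Delta u_i/\Delta x$. For small $g = \Delta u_i/\Delta x$ we have that

$$ \Delta s_i = \sqrt{\Delta u_i^2 + \Delta x^2} = \Delta x\sqrt{1 + g^2}
   = \Delta x\left(1 + \frac{1}{2}g^2 + \mathcal{O}(g^4)\right) \approx \Delta x. $$

Equation (2.120) is then simply the identity T = T, while (2.121) can be written as

$$ m_i u_i''(t) = T\frac{\Delta u_{i+1}}{\Delta x} - T\frac{\Delta u_i}{\Delta x}, $$

which upon division by $\Delta x$ and introducing the density $\varrho_i = m_i/\Delta x$
becomes

$$ \varrho_i u_i''(t) = T\frac{1}{\Delta x^2}\left(u_{i+1} - 2u_i + u_{i-1}\right). \tag{2.122} $$

We can now choose to approximate $u_i''$ by a finite difference in time and get the
discretized wave equation,

$$ \varrho_i\frac{1}{\Delta t^2}\left(u_i^{n+1} - 2u_i^n + u_i^{n-1}\right)
   = T\frac{1}{\Delta x^2}\left(u_{i+1}^n - 2u_i^n + u_{i-1}^n\right). \tag{2.123} $$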
On the other hand, we may go to the continuum limit $\Delta x \rightarrow 0$ and replace
$u_i(t)$ by $u(x,t)$, $\varrho_i$ by $\varrho(x)$, and recognize that the right-hand side
of (2.122) approaches $T\,\partial^2 u/\partial x^2$ as $\Delta x \rightarrow 0$. We end up
with the continuous model for waves on a string:

$$ \varrho\frac{\partial^2 u}{\partial t^2} = T\frac{\partial^2 u}{\partial x^2}. \tag{2.124} $$

Note that the density $\varrho$ may change along the string, while the tension T is a
constant. With variable wave velocity $c(x) = \sqrt{T/\varrho(x)}$ we can write the wave
equation in the more standard form

$$ \frac{\partial^2 u}{\partial t^2} = c^2(x)\frac{\partial^2 u}{\partial x^2}. \tag{2.125} $$

Because of the way $\varrho$ enters the equations, the variable wave velocity does not
appear inside the derivatives as in many other versions of the wave equation. However,
most strings of interest have constant $\varrho$.

The end points of a string are fixed so that the displacement u is zero. The
boundary conditions are therefore u = 0.
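
The discretized equation (2.123), combined with u = 0 at both ends, translates almost
directly into code. The sketch below is an illustration under stated assumptions:
constant density, a string initially at rest, and a hypothetical function name
simulate_string with a Courant number argument C that fixes the time step through
$\Delta t = C\Delta x/c$. The special formula for the first step follows from
combining $u_t = 0$ at t = 0 with the scheme.

```python
import math

def simulate_string(I, L, T, rho, Nx, Nt, C=1.0):
    """Advance the discrete string model Nt >= 1 time steps.

    I: initial shape, L: length, T: tension, rho: constant density,
    Nx: number of mesh cells, C: Courant number c*dt/dx (C <= 1).
    Returns the mesh points and the displacement at time level Nt.
    """
    dx = float(L)/Nx
    c = math.sqrt(T/rho)          # wave velocity
    dt = C*dx/c
    C2 = (c*dt/dx)**2
    x = [i*dx for i in range(Nx+1)]
    u_nm1 = [I(xi) for xi in x]   # time level 0
    u_nm1[0] = u_nm1[Nx] = 0.0    # fixed ends
    # Special first step, using u_t = 0 at t = 0
    u_n = u_nm1[:]
    for i in range(1, Nx):
        u_n[i] = u_nm1[i] + 0.5*C2*(
            u_nm1[i+1] - 2*u_nm1[i] + u_nm1[i-1])
    # General step, (2.123) solved for the new time level
    for n in range(1, Nt):
        u = [0.0]*(Nx+1)
        for i in range(1, Nx):
            u[i] = 2*u_n[i] - u_nm1[i] + C2*(
                u_n[i+1] - 2*u_n[i] + u_n[i-1])
        u_nm1, u_n = u_n, u
    return x, u_n

# One full period (2L/c) of a standing wave, using illustrative data
x, u = simulate_string(lambda x: math.sin(math.pi*x),
                       L=1.0, T=150.0, rho=0.01, Nx=20, Nt=40)
```

With C = 1 the scheme reproduces the standing wave $\sin(\pi x/L)\cos(\pi ct/L)$ to
round-off accuracy at the mesh points, so the shape after one period equals the
initial shape, which makes a convenient verification.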


Damping Air resistance and non-elastic effects in the string will contribute to
reduce the amplitudes of the waves so that the motion dies out after some time. This
damping effect can be modeled by a term $bu_t$ on the left-hand side of the equation

$$ \varrho\frac{\partial^2 u}{\partial t^2} + b\frac{\partial u}{\partial t}
   = T\frac{\partial^2 u}{\partial x^2}. \tag{2.126} $$

The parameter b > 0 is small for most wave phenomena, but the damping effect
may become significant in long time simulations.


External forcing It is easy to include an external force acting on the string. Say
we have a vertical force $\tilde f_i\boldsymbol{j}$ acting on mass $m_i$, modeling the
effect of gravity on a string. This force affects the vertical component of Newton's
law and gives rise to an extra term $\tilde f(x,t)$ on the right-hand side of (2.124).
In the model (2.125) we would add a term $f(x,t) = \tilde f(x,t)/\varrho(x)$.


Modeling the tension via springs We assumed, in the derivation above, that the
tension in the string, T, was constant. It is easy to check this assumption by
modeling the string segments between the masses as standard springs, where the
force (tension T) is proportional to the elongation of the spring segment. Let k
be the spring constant, and set $T_i = k\Delta\ell$ for the tension in the spring segment
between $x_{i-1}$ and $x_i$, where $\Delta\ell$ is the elongation of this segment from the
tension-free state. A basic feature of a string is that it has high tension in the
equilibrium position u = 0. Let the string segment have an elongation $\Delta\ell_0$ in
the equilibrium position. After deformation of the string, the elongation is
$\Delta\ell = \Delta\ell_0 + \Delta s_i$:
$T_i = k(\Delta\ell_0 + \Delta s_i) \approx k(\Delta\ell_0 + \Delta x)$. This shows that
$T_i$ is independent of i. Moreover, the extra approximate elongation $\Delta x$ is very
small compared to $\Delta\ell_0$, so we may well set $T_i = T = k\Delta\ell_0$. This means
that the tension is completely dominated by the initial tension determined by the
tuning of the string. The additional deformations of the spring during the vibrations
do not introduce significant changes in the tension.
2.14.2 Elastic Waves in a Rod


Consider an elastic rod subject to a hammer impact at the end. This experiment will 
give rise to an elastic deformation pulse that travels through the rod. A mathematical 
model for longitudinal waves along an elastic rod starts with the general equation 
for deformations and stresses in an elastic medium, 


$$ \varrho\boldsymbol{u}_{tt} = \nabla\cdot\boldsymbol{\sigma} + \varrho\boldsymbol{f}, \tag{2.127} $$

where $\varrho$ is the density, $\boldsymbol{u}$ the displacement field,
$\boldsymbol{\sigma}$ the stress tensor, and $\boldsymbol{f}$ body forces. The latter
normally has no impact on elastic waves.

For stationary deformation of an elastic rod, aligned with the x axis, one has
that $\sigma_{xx} = Eu_x$, with all other stress components being zero. The parameter E
is known as Young's modulus. Moreover, we set $\boldsymbol{u} = u(x,t)\boldsymbol{i}$ and
neglect the radial contraction and expansion (where Poisson's ratio is the important
parameter). Assuming that this simple stress and deformation field is a good
approximation, (2.127) simplifies to

$$ \varrho\frac{\partial^2 u}{\partial t^2}
   = \frac{\partial}{\partial x}\left(E\frac{\partial u}{\partial x}\right). \tag{2.128} $$

The associated boundary conditions are u or $\sigma_{xx} = Eu_x$ known, typically u = 0
for a fixed end and $\sigma_{xx} = 0$ for a free end.


2.14.3 Waves on a Membrane 


Think of a thin, elastic membrane with shape as a circle or rectangle. This membrane
can be brought into oscillatory motion and will develop elastic waves. We
can model this phenomenon somewhat similarly to waves in a rod: waves in a membrane
are simply the two-dimensional counterpart. We assume that the material
is deformed in the z direction only and write the elastic displacement field in the
form $\boldsymbol{u}(x,y,t) = w(x,y,t)\boldsymbol{k}$, where $\boldsymbol{k}$ is the unit
vector in the z direction. The z coordinate is omitted since the membrane is
thin and all properties are taken as constant throughout the thickness. Inserting this
displacement field in Newton's 2nd law of motion (2.127) results in

$$ \varrho\frac{\partial^2 w}{\partial t^2}
   = \frac{\partial}{\partial x}\left(\mu\frac{\partial w}{\partial x}\right)
   + \frac{\partial}{\partial y}\left(\mu\frac{\partial w}{\partial y}\right). \tag{2.129} $$

This is nothing but a wave equation in $w(x,y,t)$, which needs the usual initial
conditions on $w$ and $w_t$ as well as a boundary condition w = 0. When computing
the stress in the membrane, one needs to split $\boldsymbol{\sigma}$ into a constant
high-stress component due to the fact that all membranes are normally pre-stressed,
plus a component proportional to the displacement and governed by the wave motion.


2.14.4 The Acoustic Model for Seismic Waves 


Seismic waves are used to infer properties of subsurface geological structures. The
physical model is a heterogeneous elastic medium where sound is propagated by
small elastic vibrations. The general mathematical model for deformations in an
elastic medium is based on Newton's second law,

$$ \varrho\boldsymbol{u}_{tt} = \nabla\cdot\boldsymbol{\sigma} + \varrho\boldsymbol{f}, \tag{2.130} $$

and a constitutive law relating $\boldsymbol{\sigma}$ to $\boldsymbol{u}$, often Hooke's
generalized law,

$$ \boldsymbol{\sigma} = K\nabla\cdot\boldsymbol{u}\,\boldsymbol{I}
   + G\left(\nabla\boldsymbol{u} + (\nabla\boldsymbol{u})^T
   - \frac{2}{3}\nabla\cdot\boldsymbol{u}\,\boldsymbol{I}\right). \tag{2.131} $$

Here, $\boldsymbol{u}$ is the displacement field, $\boldsymbol{\sigma}$ is the stress
tensor, $\boldsymbol{I}$ is the identity tensor, $\varrho$ is the medium's density,
$\boldsymbol{f}$ are body forces (such as gravity), K is the medium's bulk
modulus and G is the shear modulus. All these quantities may vary in space, while
$\boldsymbol{u}$ and $\boldsymbol{\sigma}$ will also show significant variation in time
during wave motion.

The acoustic approximation to elastic waves arises from a basic assumption that
the second term in Hooke's law, representing the deformations that give rise to shear
stresses, can be neglected. This assumption can be interpreted as approximating the
geological medium by a fluid. Neglecting also the body forces $\boldsymbol{f}$, (2.130)
becomes

$$ \varrho\boldsymbol{u}_{tt} = \nabla(K\nabla\cdot\boldsymbol{u}). \tag{2.132} $$

Introducing p as a pressure via

$$ p = -K\nabla\cdot\boldsymbol{u}, \tag{2.133} $$

and dividing (2.132) by $\varrho$, we get

$$ \boldsymbol{u}_{tt} = -\frac{1}{\varrho}\nabla p. \tag{2.134} $$

Taking the divergence of this equation, using $\nabla\cdot\boldsymbol{u} = -p/K$ from
(2.133), gives the acoustic approximation to elastic waves:

$$ p_{tt} = K\nabla\cdot\left(\frac{1}{\varrho}\nabla p\right). \tag{2.135} $$

This is a standard, linear wave equation with variable coefficients. It is common to
add a source term $s(x,y,z,t)$ to model the generation of sound waves:

$$ p_{tt} = K\nabla\cdot\left(\frac{1}{\varrho}\nabla p\right) + s. \tag{2.136} $$


A common additional approximation of (2.136) is based on using the chain rule
on the right-hand side,

$$ K\nabla\cdot\left(\frac{1}{\varrho}\nabla p\right)
   = \frac{K}{\varrho}\nabla^2 p + K\nabla\left(\frac{1}{\varrho}\right)\cdot\nabla p
   \approx \frac{K}{\varrho}\nabla^2 p, $$

under the assumption that the relative spatial gradient
$\nabla\varrho^{-1} = -\varrho^{-2}\nabla\varrho$ is small.
This approximation results in the simplified equation

$$ p_{tt} = \frac{K}{\varrho}\nabla^2 p + s. \tag{2.137} $$

The acoustic approximations to seismic waves are used for sound waves in the
ground, and the Earth's surface is then a boundary where p equals the atmospheric
pressure $p_0$ such that the boundary condition becomes $p = p_0$.


Anisotropy Quite often in geological materials, the effective wave velocity
$c = \sqrt{K/\varrho}$ is different in different spatial directions because geological
layers are compacted, and often twisted, in such a way that the properties in the
horizontal and vertical direction differ. With z as the vertical coordinate, we can
introduce a vertical wave velocity $c_z$ and a horizontal wave velocity $c_h$, and
generalize (2.137) to

$$ p_{tt} = c_z^2 p_{zz} + c_h^2(p_{xx} + p_{yy}) + s. \tag{2.138} $$


2.14.5 Sound Waves in Liquids and Gases 


Sound waves arise from pressure and density variations in fluids. The starting point
of modeling sound waves is the basic equations for a compressible fluid where we
omit viscous (frictional) forces, body forces (gravity, for instance), and temperature
effects:

$$ \varrho_t + \nabla\cdot(\varrho\boldsymbol{u}) = 0, \tag{2.139} $$
$$ \varrho\boldsymbol{u}_t + \varrho\boldsymbol{u}\cdot\nabla\boldsymbol{u} = -\nabla p, \tag{2.140} $$
$$ \varrho = \varrho(p). \tag{2.141} $$

These equations are often referred to as the Euler equations for the motion of a
fluid. The parameters involved are the density $\varrho$, the velocity $\boldsymbol{u}$,
and the pressure p. Equation (2.139) reflects mass balance, (2.140) is Newton's second
law for a fluid, with frictional and body forces omitted, and (2.141) is a constitutive
law relating density to pressure by thermodynamic considerations. A typical model for
(2.141) is the so-called isentropic relation¹⁵, valid for adiabatic processes where
there is no heat transfer:

$$ \varrho = \varrho_0\left(\frac{p}{p_0}\right)^{1/\gamma}. \tag{2.142} $$

Here, $p_0$ and $\varrho_0$ are reference values for p and $\varrho$ when the fluid is
at rest, and $\gamma$ is the ratio of specific heats at constant pressure and constant
volume ($\gamma \approx 7/5$ for air).

15 http://en.wikipedia.org/wiki/Isentropic_process

The key approximation in a mathematical model for sound waves is to assume
that these waves are small perturbations to the density, pressure, and velocity. We
therefore write

$$ p = p_0 + \hat p, \qquad \varrho = \varrho_0 + \hat\varrho, \qquad
   \boldsymbol{u} = \hat{\boldsymbol{u}}, $$

where we have decomposed the fields in a constant equilibrium value, corresponding
to $\boldsymbol{u} = 0$, and a small perturbation marked with a hat symbol. By inserting
these decompositions in (2.139) and (2.140), neglecting all product terms of small
perturbations and/or their derivatives, and dropping the hat symbols, one gets the
following linearized PDE system for the small perturbations in density, pressure,
and velocity:

$$ \varrho_t + \varrho_0\nabla\cdot\boldsymbol{u} = 0, \tag{2.143} $$
$$ \varrho_0\boldsymbol{u}_t = -\nabla p. \tag{2.144} $$

Now we can eliminate $\varrho_t$ by differentiating the relation $\varrho(p)$,

$$ \varrho_t = \varrho_0\frac{1}{\gamma}\left(\frac{p}{p_0}\right)^{1/\gamma - 1}\frac{1}{p_0}p_t
   = \frac{\varrho_0}{\gamma p_0}\left(\frac{p}{p_0}\right)^{1/\gamma - 1}p_t. $$

The product term $p^{1/\gamma - 1}p_t$ can be linearized as $p_0^{1/\gamma - 1}p_t$,
resulting in

$$ \varrho_t \approx \frac{\varrho_0}{\gamma p_0}p_t. $$

We then get

$$ p_t + \gamma p_0\nabla\cdot\boldsymbol{u} = 0, \tag{2.145} $$
$$ \boldsymbol{u}_t = -\frac{1}{\varrho_0}\nabla p. \tag{2.146} $$

Taking the divergence of (2.146) and differentiating (2.145) with respect to time
gives the possibility to easily eliminate $\nabla\cdot\boldsymbol{u}_t$ and arrive at a
standard, linear wave equation for p:

$$ p_{tt} = c^2\nabla^2 p, \tag{2.147} $$

where $c = \sqrt{\gamma p_0/\varrho_0}$ is the speed of sound in the fluid.
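
To get a feeling for the magnitude of c, the formula can be evaluated for air. The
numbers below are assumptions chosen for illustration (standard atmospheric pressure,
the density of dry air at about 20 °C, and the commonly tabulated $\gamma = 1.4$);
the function name speed_of_sound is hypothetical.

```python
import math

def speed_of_sound(gamma, p0, rho0):
    """Return c = sqrt(gamma*p0/rho0) for a fluid at rest."""
    return math.sqrt(gamma*p0/rho0)

# Illustrative values for air near sea level
c_air = speed_of_sound(gamma=1.4,      # ratio of specific heats
                       p0=101325.0,    # pressure in Pa
                       rho0=1.204)     # density in kg/m^3
print('Speed of sound in air: %.1f m/s' % c_air)
```

The result is a little above 340 m/s, which agrees with the familiar value for the
speed of sound in air at room temperature.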


2.14.6 Spherical Waves 


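Spherically symmetric three-dimensional waves propagate in the radial direction r
only so that u = u(r,t). The fully three-dimensional wave equation

$$ \frac{\partial^2 u}{\partial t^2} = \nabla\cdot(c^2\nabla u) + f $$

then reduces to the spherically symmetric wave equation

$$ \frac{\partial^2 u}{\partial t^2} = \frac{1}{r^2}\frac{\partial}{\partial r}
   \left(c^2(r)r^2\frac{\partial u}{\partial r}\right) + f(r,t),
   \quad r\in(0,R),\ t>0. \tag{2.148} $$

One can easily show that the function v(r,t) = ru(r,t) fulfills a standard wave
equation in Cartesian coordinates if c is constant. To this end, insert u = v/r in

$$ \frac{1}{r^2}\frac{\partial}{\partial r}\left(c^2(r)r^2\frac{\partial u}{\partial r}\right) $$

to obtain

$$ \frac{1}{r^2}\left[r\left(\frac{dc^2}{dr}\frac{\partial v}{\partial r}
   + c^2\frac{\partial^2 v}{\partial r^2}\right) - \frac{dc^2}{dr}v\right]. $$

The two terms in the parenthesis can be combined to

$$ r\frac{\partial}{\partial r}\left(c^2\frac{\partial v}{\partial r}\right), $$

which is recognized as the variable-coefficient Laplace operator in one Cartesian
coordinate. Since $\partial^2 u/\partial t^2 = r^{-1}\partial^2 v/\partial t^2$,
multiplying (2.148) by r gives the spherically symmetric wave equation in terms of
v(r,t):

$$ \frac{\partial^2 v}{\partial t^2} = \frac{\partial}{\partial r}
   \left(c^2(r)\frac{\partial v}{\partial r}\right) - \frac{1}{r}\frac{dc^2}{dr}v
   + rf(r,t), \quad r\in(0,R),\ t>0. \tag{2.149} $$

In the case of constant wave velocity c, this equation reduces to the wave equation
in a single Cartesian coordinate called r:

$$ \frac{\partial^2 v}{\partial t^2} = c^2\frac{\partial^2 v}{\partial r^2} + rf(r,t),
   \quad r\in(0,R),\ t>0. \tag{2.150} $$

That is, any program for solving the one-dimensional wave equation in a Cartesian
coordinate system can be used to solve (2.150), provided the source term is multiplied
by the coordinate, and that we divide the Cartesian mesh solution by r to get
the spherically symmetric solution. Moreover, if r = 0 is included in the domain,
spherical symmetry demands that $\partial u/\partial r = 0$ at r = 0, which means that

$$ \frac{\partial u}{\partial r} = \frac{1}{r^2}\left(r\frac{\partial v}{\partial r} - v\right) = 0,
   \quad r = 0. $$

For this to hold in the limit $r \rightarrow 0$, we must have v(0,t) = 0 at least as a
necessary condition. In most practical applications, we exclude r = 0 from the domain
and assume that some boundary condition is assigned at $r = \epsilon$, for some
$\epsilon > 0$.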
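
The relation between v and u can be checked numerically. The sketch below takes the
traveling wave v(r,t) = sin(r - ct), which solves the 1D wave equation with constant c
and no source, forms u = v/r, and evaluates the residual of the spherically symmetric
equation (2.148) with f = 0 by centered differences. The choice of v, the step h, and
the function names are assumptions made for illustration.

```python
import math

def u(r, t, c=1.0):
    """u = v/r with v(r,t) = sin(r - c*t), a 1D traveling wave."""
    return math.sin(r - c*t)/r

def spherical_residual(r, t, c=1.0, h=1e-3):
    """Finite difference residual of (2.148) with f=0, constant c."""
    # Second-order centered difference in time
    u_tt = (u(r, t+h, c) - 2*u(r, t, c) + u(r, t-h, c))/h**2
    # r^2*du/dr evaluated at the midpoints r+h/2 and r-h/2
    flux_plus = (r + h/2)**2*(u(r+h, t, c) - u(r, t, c))/h
    flux_minus = (r - h/2)**2*(u(r, t, c) - u(r-h, t, c))/h
    # (1/r^2) d/dr (r^2 du/dr)
    laplace = (flux_plus - flux_minus)/(h*r**2)
    return u_tt - c**2*laplace

print(spherical_residual(r=2.0, t=0.3))
```

The printed residual is not exactly zero, but it is of the size of the truncation
error of the difference approximations and decreases when h is reduced (until
round-off errors take over).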


2.14.7 The Linear Shallow Water Equations 


The next example considers water waves whose wavelengths are much larger than
the depth and whose wave amplitudes are small. This class of waves may be generated
by catastrophic geophysical events, such as earthquakes at the sea bottom,
landslides moving into water, or underwater slides (or a combination, as earthquakes
frequently release avalanches of masses). For example, a subsea earthquake
will normally have an extension of many kilometers but lift the water only a few
meters. The wavelength will have a size dictated by the earthquake area, which is
much larger than the water depth, and compared to this wavelength, an amplitude of
a few meters is very small. The water is essentially a thin film, and mathematically
we can average the problem in the vertical direction and approximate the 3D wave
phenomenon by 2D PDEs. Instead of a moving water domain in three space dimensions,
we get a horizontal 2D domain with an unknown function for the surface
elevation and the water depth as a variable coefficient in the PDEs.

Let $\eta(x,y,t)$ be the elevation of the water surface, $H(x,y)$ the water depth
corresponding to a flat surface ($\eta = 0$), $u(x,y,t)$ and $v(x,y,t)$ the
depth-averaged horizontal velocities of the water. Mass and momentum balance of the
water volume give rise to the PDEs involving these quantities:

$$ \eta_t = -(Hu)_x - (Hv)_y, \tag{2.151} $$
$$ u_t = -g\eta_x, \tag{2.152} $$
$$ v_t = -g\eta_y, \tag{2.153} $$

where g is the acceleration of gravity. Equation (2.151) corresponds to mass balance
while the other two are derived from momentum balance (Newton's second law).

The initial conditions associated with (2.151)-(2.153) are $\eta$, u, and v prescribed
at t = 0. A common condition is to have some water elevation $\eta = I(x,y)$
and assume that the surface is at rest: u = v = 0. A subsea earthquake usually
means a sufficiently rapid motion of the bottom and the water volume to say that
the bottom deformation is mirrored at the water surface as an initial lift $I(x,y)$ and
that u = v = 0.

Boundary conditions may be $\eta$ prescribed for incoming, known waves, or zero
normal velocity at reflecting boundaries (steep mountains, for instance):
$un_x + vn_y = 0$, where $(n_x, n_y)$ is the outward unit normal to the boundary. More
sophisticated boundary conditions are needed when waves run up at the shore, and
at open boundaries where we want the waves to leave the computational domain
undisturbed.

Equations (2.151), (2.152), and (2.153) can be transformed to a standard, linear
wave equation. First, multiply (2.152) and (2.153) by H, differentiate (2.152) with
respect to x and (2.153) with respect to y. Second, differentiate (2.151) with respect
to t and use that $(Hu)_{xt} = (Hu_t)_x$ and $(Hv)_{yt} = (Hv_t)_y$ when H is independent
of t. Third, eliminate $(Hu_t)_x$ and $(Hv_t)_y$ with the aid of the other two
differentiated equations. These manipulations result in a standard, linear wave
equation for $\eta$:

$$ \eta_{tt} = (gH\eta_x)_x + (gH\eta_y)_y = \nabla\cdot(gH\nabla\eta). \tag{2.154} $$

In the case we have an initial non-flat water surface at rest, the initial conditions
become $\eta = I(x,y)$ and $\eta_t = 0$. The latter follows from (2.151) if u = v = 0, or
simply from the fact that the vertical velocity of the surface is $\eta_t$, which is
zero for a surface at rest.
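
Comparing (2.154) with the standard form of a wave equation shows that gH plays the
role of the squared wave velocity, so long waves travel with speed $\sqrt{gH}$. The
small sketch below evaluates this speed and the corresponding travel time over a
region of constant depth. The function names and the sample numbers (4000 m depth,
1000 km distance) are assumptions chosen for illustration.

```python
import math

def long_wave_speed(H, g=9.81):
    """Wave velocity sqrt(g*H) implied by the coefficient in (2.154)."""
    return math.sqrt(g*H)

def travel_time(distance, H, g=9.81):
    """Time for a long wave to cover distance at constant depth H."""
    return distance/long_wave_speed(H, g)

c_ocean = long_wave_speed(4000.0)       # deep ocean, H in m
t_cross = travel_time(1.0e6, 4000.0)    # 1000 km, result in s
print('c = %.0f m/s, travel time = %.1f h' % (c_ocean, t_cross/3600))
```

At 4000 m depth the speed is close to 200 m/s, i.e., about 700 km/h, which explains
why a tsunami can cross an ocean in a matter of hours.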

The system (2.151)-(2.153) can be extended to handle a time-varying bottom
topography, which is relevant for modeling long waves generated by underwater
slides. In such cases the water depth function H is also a function of t, due to the
moving slide, and one must add a time-derivative term $H_t$ to the left-hand side of
(2.151). A moving bottom is best described by introducing $z = H_0$ as the still-water
level, $z = B(x,y,t)$ as the time- and space-varying bottom topography, so
that $H = H_0 - B(x,y,t)$. In the elimination of u and v one may assume that the
dependence of H on t can be neglected in the terms $(Hu)_{xt}$ and $(Hv)_{yt}$. We then
end up with a source term in (2.154), because of the moving (accelerating) bottom:

$$ \eta_{tt} = \nabla\cdot(gH\nabla\eta) + B_{tt}. \tag{2.155} $$

The reduction of (2.155) to 1D, for long waves in a straight channel, or for
approximately plane waves in the ocean, is trivial by assuming no change in y
direction ($\partial/\partial y = 0$):

$$ \eta_{tt} = (gH\eta_x)_x + B_{tt}. \tag{2.156} $$


Wind drag on the surface Surface waves are influenced by the drag of the wind,
and if the wind velocity some meters above the surface is (U, V), the wind drag
gives contributions $C_V\sqrt{U^2 + V^2}\,U$ and $C_V\sqrt{U^2 + V^2}\,V$ to (2.152) and
(2.153), respectively, on the right-hand sides.


Bottom drag The waves will experience a drag from the bottom, often roughly
modeled by a term similar to the wind drag: $C_B\sqrt{u^2 + v^2}\,u$ on the right-hand
side of (2.152) and $C_B\sqrt{u^2 + v^2}\,v$ on the right-hand side of (2.153). Note that
in this case the PDEs (2.152) and (2.153) become nonlinear and the elimination of u and
v to arrive at a 2nd-order wave equation for $\eta$ is not possible anymore.


Effect of the Earth's rotation Long geophysical waves will often be affected by
the rotation of the Earth because of the Coriolis force. This force gives rise to a term
$fv$ on the right-hand side of (2.152) and $-fu$ on the right-hand side of (2.153).
Also in this case one cannot eliminate u and v to work with a single equation for
$\eta$. The Coriolis parameter is $f = 2\Omega\sin\phi$, where $\Omega$ is the angular
velocity of the Earth and $\phi$ is the latitude.
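
The Coriolis parameter is straightforward to evaluate. The sketch below assumes the
commonly quoted value 7.2921e-5 rad/s for the angular velocity of the Earth; the
function name coriolis_parameter and the latitude given in degrees are choices made
here for illustration.

```python
import math

OMEGA_EARTH = 7.2921e-5   # angular velocity of the Earth in rad/s

def coriolis_parameter(latitude_deg, Omega=OMEGA_EARTH):
    """Return f = 2*Omega*sin(phi) with the latitude phi in degrees."""
    return 2*Omega*math.sin(math.radians(latitude_deg))

for lat in 0, 30, 45, 60, 90:
    print('latitude %2d: f = %.3e 1/s' % (lat, coriolis_parameter(lat)))
```

The parameter vanishes at the equator, is about 1e-4 1/s at mid latitudes, and
changes sign in the southern hemisphere.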


2.14.8 Waves in Blood Vessels 


The flow of blood in our bodies is basically fluid flow in a network of pipes. Unlike 
rigid pipes, the walls in the blood vessels are elastic and will increase their diameter 
when the pressure rises. The elastic forces will then push the wall back and accel- 
erate the fluid. This interaction between the flow of blood and the deformation of 
the vessel wall results in waves traveling along our blood vessels. 

A model for one-dimensional waves along blood vessels can be derived from
averaging the fluid flow over the cross section of the blood vessels. Let x be a
coordinate along the blood vessel and assume that all cross sections are circular,
though with different radii R(x,t). The main quantities to compute are the cross
section area A(x,t), the averaged pressure P(x,t), and the total volume flux Q(x,t).
The area of this cross section is

$$ A(x,t) = 2\pi\int_0^{R(x,t)} r\,dr. \tag{2.157} $$

Let $v_x(x,t)$ be the velocity of blood averaged over the cross section at point x. The
volume flux, being the total volume of blood passing a cross section per time unit,
becomes

$$ Q(x,t) = A(x,t)v_x(x,t). \tag{2.158} $$

Mass balance and Newton's second law lead to the PDEs

$$ \frac{\partial A}{\partial t} + \frac{\partial Q}{\partial x} = 0, \tag{2.159} $$

$$ \frac{\partial Q}{\partial t} + \frac{\gamma + 2}{\gamma + 1}\frac{\partial}{\partial x}
   \left(\frac{Q^2}{A}\right) + \frac{A}{\varrho}\frac{\partial P}{\partial x}
   = -2\pi(\gamma + 2)\frac{\mu}{\varrho}\frac{Q}{A}, \tag{2.160} $$

where $\gamma$ is a parameter related to the velocity profile, $\varrho$ is the density
of blood, and $\mu$ is the dynamic viscosity of blood.

We have three unknowns A, Q, and P, and two equations (2.159) and (2.160).
A third equation is needed to relate the flow to the deformations of the wall. A
common form for this equation is

$$ C\frac{\partial P}{\partial t} + \frac{\partial Q}{\partial x} = 0, \tag{2.161} $$

where C is the compliance of the wall, given by the constitutive relation

$$ C = \frac{\partial A}{\partial P} + \frac{\partial A}{\partial t}, \tag{2.162} $$

which requires a relationship between A and P. One common model is to view the
vessel wall, locally, as a thin elastic tube subject to an internal pressure. This gives
the relation

$$ P = P_0 + \frac{\pi hE}{(1 - \nu^2)A_0}\left(\sqrt{A} - \sqrt{A_0}\right), $$

where $P_0$ and $A_0$ are corresponding reference values when the wall is not deformed,
h is the thickness of the wall, and E and $\nu$ are Young's modulus and Poisson's ratio
of the elastic material in the wall. The derivative becomes

$$ C = \frac{\partial A}{\partial P} = \frac{2(1 - \nu^2)A_0}{\pi hE}\sqrt{A_0}
   + 2\left(\frac{(1 - \nu^2)A_0}{\pi hE}\right)^2(P - P_0). \tag{2.163} $$

Another (nonlinear) deformation model of the wall, which has a better fit with
experiments, is

$$ P = P_0\exp\left(\beta(A/A_0 - 1)\right), $$

where $\beta$ is some parameter to be estimated. This law leads to

$$ C = \frac{\partial A}{\partial P} = \frac{A_0}{\beta P}. \tag{2.164} $$
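
The compliance of the thin elastic tube model can be verified by differentiating A(P)
numerically. The sketch below solves the pressure-area relation for A, compares the
analytical derivative (2.163) with a centered difference, and also evaluates
$\sqrt{A/(\varrho C)}$, the velocity that appears when the model is reduced to a
standard wave equation. All parameter values are rough assumptions chosen for
illustration only, and the function names are hypothetical.

```python
import math

def tube_constant(A0, h, E, nu):
    """The factor (1 - nu^2)*A0/(pi*h*E) in the elastic tube model."""
    return (1 - nu**2)*A0/(math.pi*h*E)

def area(P, P0, A0, h, E, nu):
    """Cross section area A(P): the tube relation solved for A."""
    k = tube_constant(A0, h, E, nu)
    return (math.sqrt(A0) + k*(P - P0))**2

def compliance(P, P0, A0, h, E, nu):
    """Analytical C = dA/dP according to (2.163)."""
    k = tube_constant(A0, h, E, nu)
    return 2*k*math.sqrt(A0) + 2*k**2*(P - P0)

# Assumed sample data in SI units (not measured values)
data = dict(P0=1.0e4, A0=math.pi*0.01**2, h=1.0e-3, E=4.0e5, nu=0.5)
P = 1.2e4
dP = 1.0
C_exact = compliance(P, **data)
C_num = (area(P + dP, **data) - area(P - dP, **data))/(2*dP)
rho = 1060.0   # assumed density of blood in kg/m^3
c = math.sqrt(area(P, **data)/(rho*C_exact))
print('C = %.3e, numerical C = %.3e, c = %.1f m/s' % (C_exact, C_num, c))
```

Since A is a quadratic function of P in this model, the centered difference is exact
apart from round-off errors, so the two compliance values should agree to many digits.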
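Reduction to the standard wave equation It is not uncommon to neglect the
viscous term on the right-hand side of (2.160) and also the quadratic term with $Q^2$
on the left-hand side. The reduced equations (2.160) and (2.161) form a first-order
linear wave equation system:

$$ C\frac{\partial P}{\partial t} = -\frac{\partial Q}{\partial x}, \tag{2.165} $$

$$ \frac{\partial Q}{\partial t} = -\frac{A}{\varrho}\frac{\partial P}{\partial x}. \tag{2.166} $$

These can be combined into a standard 1D wave PDE by differentiating the first
equation with respect to t and the second with respect to x,

$$ \frac{\partial}{\partial t}\left(C\frac{\partial P}{\partial t}\right)
   = \frac{\partial}{\partial x}\left(\frac{A}{\varrho}\frac{\partial P}{\partial x}\right), $$

which can be approximated by

$$ \frac{\partial^2 Q}{\partial t^2} = c^2\frac{\partial^2 Q}{\partial x^2}, \quad
   c = \sqrt{\frac{A}{\varrho C}}, \tag{2.167} $$

where the A and C in the expression for c are taken as constant reference values
(with constant A and C, the pressure P fulfills the same wave equation).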


2.14.9 Electromagnetic Waves 


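Light and radio waves are governed by standard wave equations arising from
Maxwell's general equations. When there are no charges and no currents, as in a
vacuum, Maxwell's equations take the form

$$ \nabla\cdot\boldsymbol{E} = 0, $$
$$ \nabla\cdot\boldsymbol{B} = 0, $$
$$ \nabla\times\boldsymbol{E} = -\frac{\partial\boldsymbol{B}}{\partial t}, $$
$$ \nabla\times\boldsymbol{B} = \mu_0\epsilon_0\frac{\partial\boldsymbol{E}}{\partial t}, $$

where $\epsilon_0 = 8.854187817620\cdot 10^{-12}$ (F/m) is the permittivity of free space,
also known as the electric constant, and $\mu_0 = 1.2566370614\cdot 10^{-6}$ (H/m) is the
permeability of free space, also known as the magnetic constant. Taking the curl of the
two last equations and using the mathematical identity

$$ \nabla\times(\nabla\times\boldsymbol{E}) = \nabla(\nabla\cdot\boldsymbol{E})
   - \nabla^2\boldsymbol{E} = -\nabla^2\boldsymbol{E}
   \quad\hbox{when } \nabla\cdot\boldsymbol{E} = 0, $$

gives the wave equation governing the electric and magnetic field:

$$ \frac{\partial^2\boldsymbol{E}}{\partial t^2} = c^2\nabla^2\boldsymbol{E}, \tag{2.168} $$

$$ \frac{\partial^2\boldsymbol{B}}{\partial t^2} = c^2\nabla^2\boldsymbol{B}, \tag{2.169} $$

with $c = 1/\sqrt{\mu_0\epsilon_0}$ as the velocity of light. Each component of
$\boldsymbol{E}$ and $\boldsymbol{B}$ fulfills a wave equation and can hence be solved
independently.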
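
As a quick check of the formula for c, the two constants quoted above can be inserted
directly. The sketch below is only an arithmetic illustration; the names MU_0,
EPSILON_0, and speed_of_light are chosen here.

```python
import math

EPSILON_0 = 8.854187817620e-12   # permittivity of free space, F/m
MU_0 = 1.2566370614e-6           # permeability of free space, H/m

def speed_of_light(mu0=MU_0, eps0=EPSILON_0):
    """Return c = 1/sqrt(mu0*eps0) in m/s."""
    return 1.0/math.sqrt(mu0*eps0)

print('c = %.6e m/s' % speed_of_light())
```

The computed value agrees with the defined speed of light, 299 792 458 m/s, to the
precision of the constants used.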
Exercise 2.20: Simulate waves on a non-homogeneous string

Simulate waves on a string that consists of two materials with different density. The
tension in the string is constant, but the density has a jump at the middle of the
string. Experiment with different sizes of the jump and produce animations that
visualize the effect of the jump on the wave motion.


Hint According to Sect. 2.14.1, the density enters the mathematical model as $\varrho$
in $\varrho u_{tt} = Tu_{xx}$, where T is the string tension. Modify, e.g., the
wave1D_u0v.py code to incorporate the tension and two density values. Make a mesh
function rho with density values at each spatial mesh point. A value for the tension
may be 150 N. Corresponding density values can be computed from the wave velocity
estimations in the guitar function in the wave1D_u0v.py file.

Filename: wave1D_u0_sv_discont. 


Exercise 2.21: Simulate damped waves on a string 

Formulate a mathematical model for damped waves on a string. Use data from 
Sect. 2.3.6, and tune the damping parameter so that the string is very close to the 
rest state after 15 s. Make a movie of the wave motion. 

Filename: wave1D_u0_sv_damping. 


Exercise 2.22: Simulate elastic waves in a rod 

A hammer hits the end of an elastic rod. The exercise is to simulate the resulting
wave motion using the model (2.128) from Sect. 2.14.2. Let the rod have length L
and let the boundary x = L be stress free so that $\sigma_{xx} = 0$, implying that
$\partial u/\partial x = 0$. The left end x = 0 is subject to a strong stress pulse (the
hammer), modeled as

$$ \sigma_{xx}(t) = \left\{\begin{array}{ll}
   S, & 0 < t \leq t_s,\\
   0, & t > t_s
   \end{array}\right. $$

The corresponding condition on u becomes $u_x = S/E$ for $t \leq t_s$ and zero
afterwards (recall that $\sigma_{xx} = Eu_x$). This is a non-homogeneous Neumann
condition, and you will need to approximate this condition and combine it with the
scheme (the ideas and manipulations follow closely the handling of a non-zero initial
condition $u_t = V$ in wave PDEs or the corresponding second-order ODEs for vibrations).
Filename: wave_rod. 


Exercise 2.23: Simulate spherical waves 

Implement a model for spherically symmetric waves using the method described
in Sect. 2.14.6. The boundary condition at $r = 0$ must be $\partial u/\partial r = 0$, while the
condition at $r = R$ can either be $u = 0$ or a radiation condition as described in
Problem 2.12. The $u = 0$ condition is sufficient if $R$ is so large that the amplitude
of the spherical wave has become insignificant. Make movie(s) of the case where
the source term is located around $r = 0$ and sends out pulses

$$f(r,t) = \begin{cases} Q\exp\left(-\frac{r^2}{2\Delta r^2}\right)\sin\omega t, & \sin\omega t \ge 0,\\ 0, & \sin\omega t < 0.\end{cases}$$

Here, $Q$ and $\omega$ are constants to be chosen.

Fig. 2.11 Sketch of initial water surface due to a subsea earthquake


Hint Use the program wave1D_u0v.py as a starting point. Let solver compute 
the v function and then set u = v/r. However, u = v/r for r = 0 requires special 
treatment. One possibility is to compute u[1:] = v[1:]/r[1:] and then set 
u[0]=u[1]. The latter makes it evident that du/dr = 0 in a plot. 
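The special treatment at $r = 0$ described in the hint can be sketched as a small helper. The function name v_to_u is hypothetical and is not part of wave1D_u0v.py.

```python
import numpy as np

def v_to_u(v, r):
    """Compute u = v/r on a radial mesh where r[0] = 0."""
    u = np.empty(len(v), dtype=float)
    u[1:] = v[1:]/r[1:]     # regular points, r > 0
    u[0] = u[1]             # mimic du/dr = 0 at r = 0
    return u

r = np.linspace(0, 1, 6)
v = 2*r                     # test data for which u = v/r = 2 everywhere
u = v_to_u(v, r)
```

Copying u[1] into u[0] avoids the 0/0 division and is a first-order representation of the symmetry condition at the origin.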

Filename: wave1D_spherical. 


Problem 2.24: Earthquake-generated tsunami over a subsea hill 

A subsea earthquake leads to an immediate lift of the water surface, see Fig. 2.11. 
The lifted water surface splits into two tsunamis, one traveling to the right and one 
to the left, as depicted in Fig. 2.12. Since tsunamis are normally very long waves, 
compared to the depth, with a small amplitude, compared to the wave length, a 
standard wave equation is relevant: 


$$\eta_{tt} = (gH(x)\eta_x)_x,$$

where $\eta$ is the elevation of the water surface, $g$ is the acceleration of gravity, and
$H(x)$ is the still water depth.

To simulate the right-going tsunami, we can impose a symmetry boundary at
$x = 0$: $\partial\eta/\partial x = 0$. We then simulate the wave motion in $[0, L]$. Unless the ocean
ends at $x = L$, the waves should travel undisturbed through the boundary $x = L$.
A radiation condition as explained in Problem 2.12 can be used for this purpose.
Alternatively, one can just stop the simulations before the wave hits the boundary
at $x = L$. In that case it does not matter what kind of boundary condition we use at
$x = L$. Imposing $\eta = 0$ and stopping the simulations when $|\eta_i^n| > \epsilon$, $i = N_x - 1$,
is a possibility ($\epsilon$ is a small parameter).

Fig. 2.12 An initial surface elevation is split into two waves

Fig. 2.13 Sketch of an earthquake-generated tsunami passing over a subsea hill


The shape of the initial surface can be taken as a Gaussian function,

$$I(x; I_0, I_a, I_m, I_s) = I_0 + I_a\exp\left(-\left(\frac{x - I_m}{I_s}\right)^2\right), \tag{2.170}$$

with $I_m = 0$ reflecting the location of the peak of $I(x)$ and $I_s$ being a measure of
the width of the function $I(x)$ ($I_s$ is $\sqrt{2}$ times the standard deviation of the familiar
normal distribution curve).

Now we extend the problem with a hill at the sea bottom, see Fig. 2.13. The
wave speed $c = \sqrt{gH(x)} = \sqrt{g(H_0 - B(x))}$ will then be reduced in the shallow
water above the hill.

One possible form of the hill is a Gaussian function,

$$B(x; B_0, B_a, B_m, B_s) = B_0 + B_a\exp\left(-\left(\frac{x - B_m}{B_s}\right)^2\right), \tag{2.171}$$

but many other shapes are also possible, e.g., a "cosine hat" where

$$B(x; B_0, B_a, B_m, B_s) = B_0 + B_a\cos\left(\pi\frac{x - B_m}{2B_s}\right), \tag{2.172}$$

when $x \in [B_m - B_s, B_m + B_s]$ while $B = B_0$ outside this interval.

Also an abrupt construction may be tried:

$$B(x; B_0, B_a, B_m, B_s) = B_0 + B_a, \tag{2.173}$$

for $x \in [B_m - B_s, B_m + B_s]$ while $B = B_0$ outside this interval.

The wave1D_dn_vc.py program can be used as starting point for the implemen- 
tation. Visualize both the bottom topography and the water surface elevation in the 
same plot. Allow for a flexible choice of bottom shape: (2.171), (2.172), (2.173),
or $B(x) = B_0$ (flat).
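One way to allow a flexible choice of bottom shape is a single function with a shape argument, sketched below. The function name bottom and the shape strings are hypothetical; the formulas follow (2.171), (2.172), and (2.173).

```python
import numpy as np

def bottom(x, shape, B0, Ba, Bm, Bs):
    """Return the bottom topography B(x) for a named hill shape."""
    x = np.asarray(x, dtype=float)
    inside = np.abs(x - Bm) <= Bs          # points in [Bm - Bs, Bm + Bs]
    if shape == 'gaussian':                # (2.171)
        return B0 + Ba*np.exp(-((x - Bm)/Bs)**2)
    elif shape == 'cosine_hat':            # (2.172)
        hat = Ba*np.cos(np.pi*(x - Bm)/(2*Bs))
        return B0 + np.where(inside, hat, 0.0)
    elif shape == 'box':                   # (2.173)
        return B0 + np.where(inside, Ba, 0.0)
    elif shape == 'flat':
        return B0 + np.zeros_like(x)
    raise ValueError('unknown shape: %s' % shape)

x = np.array([0.0, 5.0, 6.0, 10.0])
B_gauss = bottom(x, 'gaussian', B0=0.0, Ba=2.0, Bm=5.0, Bs=1.0)
B_cos = bottom(x, 'cosine_hat', B0=0.0, Ba=2.0, Bm=5.0, Bs=1.0)
B_box = bottom(x, 'box', B0=0.0, Ba=2.0, Bm=5.0, Bs=1.0)
B_flat = bottom(x, 'flat', B0=0.0, Ba=2.0, Bm=5.0, Bs=1.0)
```

The still water depth then follows as H = H0 - B, so the same function can feed both the plot of the topography and the variable coefficient in the solver.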

The purpose of this problem is to explore the quality of the numerical solution $\eta_i^n$
for different shapes of the bottom obstruction. The "cosine hat" and the box-shaped
hills have abrupt changes in the derivative of $H(x)$ and are more likely to generate
numerical noise than the smooth Gaussian shape of the hill. Investigate if this is
true.

Filename: tsunami1D_hill.


Problem 2.25: Earthquake-generated tsunami over a 3D hill 
This problem extends Problem 2.24 to a three-dimensional wave phenomenon, gov- 
erned by the 2D PDE 


$$\eta_{tt} = (gH\eta_x)_x + (gH\eta_y)_y = \nabla\cdot(gH\nabla\eta). \tag{2.174}$$

We assume that the earthquake arises from a fault along the line $x = 0$ in the $xy$-plane
so that the initial lift of the surface can be taken as $I(x)$ in Problem 2.24. That
is, a plane wave is propagating to the right, but will experience bending because of
the bottom.

The bottom shape is now a function of $x$ and $y$. An "elliptic" Gaussian function
in two dimensions, with its peak at $(B_{mx}, B_{my})$, generalizes (2.171):

$$B = B_0 + B_a\exp\left(-\left(\frac{x - B_{mx}}{B_s}\right)^2 - \left(\frac{y - B_{my}}{bB_s}\right)^2\right), \tag{2.175}$$

where $b$ is a scaling parameter: $b = 1$ gives a circular Gaussian function with
circular contour lines, while $b \ne 1$ gives an elliptic shape with elliptic contour
lines. To indicate the input parameters in the model, we may write

$$B = B(x; B_0, B_a, B_{mx}, B_{my}, B_s, b).$$

The "cosine hat" (2.172) can also be generalized to

$$B = B_0 + B_a\cos\left(\pi\frac{x - B_{mx}}{2B_s}\right)\cos\left(\pi\frac{y - B_{my}}{2B_s}\right), \tag{2.176}$$

when $0 \le \sqrt{x^2 + y^2} \le B_s$ and $B = B_0$ outside this circle.
A box-shaped obstacle means that

$$B(x; B_0, B_a, B_m, B_s, b) = B_0 + B_a \tag{2.177}$$

for $x$ and $y$ inside a rectangle

$$B_{mx} - B_s \le x \le B_{mx} + B_s, \quad B_{my} - bB_s \le y \le B_{my} + bB_s,$$

and $B = B_0$ outside this rectangle. The $b$ parameter controls the rectangular shape
of the cross section of the box.

Note that the initial condition and the listed bottom shapes are symmetric around
the line $y = B_{my}$. We therefore expect the surface elevation also to be symmetric
with respect to this line. This means that we can halve the computational domain
by working with $[0, L_x] \times [0, B_{my}]$. Along the upper boundary, $y = B_{my}$, we
must impose the symmetry condition $\partial\eta/\partial n = 0$. Such a symmetry condition
($-\eta_x = 0$) is also needed at the $x = 0$ boundary because the initial condition has
a symmetry here. At the lower boundary $y = 0$ we also set a Neumann condition
(which becomes $-\eta_y = 0$). The wave motion is to be simulated until the wave hits
the reflecting boundaries where $\partial\eta/\partial n = \eta_x = 0$ (one can also set $\eta = 0$ - the
particular condition does not matter as long as the simulation is stopped before the
wave is influenced by the boundary condition).

Visualize the surface elevation. Investigate how different hill shapes, different
sizes of the water gap above the hill, and different resolutions $\Delta x = \Delta y = h$ and
$\Delta t$ influence the numerical quality of the solution.

Filename: tsunami2D_hill. 


Problem 2.26: Investigate Mayavi for visualization 

Play with Mayavi code for visualizing 2D solutions of the wave equation with vari- 
able wave velocity. See if there are effective ways to visualize both the solution and 
the wave velocity scalar field at the same time. 

Filename: tsunami2D_hill_mlab. 


Problem 2.27: Investigate visualization packages 

Create some fancy 3D visualization of the water waves and the subsea hill in 
Problem 2.25. Try to make the hill transparent. Possible visualization tools are 
Mayavi^16, Paraview^17, and OpenDX^18.

Filename: tsunami2D_hill_viz. 


Problem 2.28: Implement loops in compiled languages 

Extend the program from Problem 2.25 such that the loops over mesh points, inside 
the time loop, are implemented in compiled languages. Consider implementations 
in Cython, Fortran via f2py, C via Cython, C via f2py, C/C++ via Instant, and
C/C++ via scipy.weave. Perform efficiency experiments to investigate the relative
performance of the various implementations. It is often advantageous to normalize 
CPU times by the fastest method on a given mesh. 

Filename: tsunami2D_hill_compiled. 


16 http://code.enthought.com/projects/mayavi/ 
17 http://www.paraview.org/ 
18 http://www.opendx.org/

Exercise 2.29: Simulate seismic waves in 2D

The goal of this exercise is to simulate seismic waves using the PDE model (2.138)
in a 2D $xz$ domain with geological layers. Introduce $m$ horizontal layers of thickness
$h_i$, $i = 0, \ldots, m - 1$. Inside layer number $i$ we have a vertical wave velocity
$c_{z,i}$ and a horizontal wave velocity $c_{h,i}$. Make a program for simulating such 2D
waves. Test it on a case with 3 layers where

$$c_{z,0} = c_{z,1} = c_{z,2}, \quad c_{h,0} = c_{h,2}, \quad c_{h,1} \ll c_{h,0}.$$


Let s be a localized point source at the middle of the Earth’s surface (the upper 
boundary) and investigate how the resulting wave travels through the medium. The 
source can be a localized Gaussian peak that oscillates in time for some time inter- 
val. Place the boundaries far enough from the expanding wave so that the boundary 
conditions do not disturb the wave. Then the type of boundary condition does not 
matter, except that we physically need to have $p = p_0$, where $p_0$ is the atmospheric
pressure, at the upper boundary.

Filename: seismic2D. 


Project 2.30: Model 3D acoustic waves in a room 
The equation for sound waves in air is derived in Sect. 2.14.5 and reads 


$$p_{tt} = c^2\nabla^2 p,$$

where $p(x, y, z, t)$ is the pressure and $c$ is the speed of sound, taken as 340 m/s.
However, sound is absorbed in the air due to relaxation of molecules in the gas.
A model for simple relaxation, valid for gases consisting only of one type of
molecules, is a term $c^2\tau_s\nabla^2 p_t$ in the PDE, where $\tau_s$ is the relaxation time. If we
generate sound from, e.g., a loudspeaker in the room, this sound source must also
be added to the governing equation.

The PDE with the mentioned type of damping and source then becomes

$$p_{tt} = c^2\nabla^2 p + c^2\tau_s\nabla^2 p_t + f, \tag{2.178}$$

where $f(x, y, z, t)$ is the source term.

The walls can absorb some sound. A possible model is to have a "wall layer"
(thicker than the physical wall) outside the room where $c$ is changed such that some
of the wave energy is reflected and some is absorbed in the wall. The absorption of
energy can be taken care of by adding a damping term $bp_t$ in the equation:

$$p_{tt} + bp_t = c^2\nabla^2 p + c^2\tau_s\nabla^2 p_t + f. \tag{2.179}$$

Typically, $b = 0$ in the room and $b > 0$ in the wall. A discontinuity in $b$ or $c$
will give rise to reflections. It can be wise to use a constant $c$ in the wall to control
reflections because of the discontinuity between $c$ in the air and in the wall, while
$b$ is gradually increased as we go into the wall to avoid reflections because of rapid
changes in $b$. At the outer boundary of the wall the condition $p = 0$ or $\partial p/\partial n = 0$
can be imposed. The waves should anyway be approximately dampened to $p = 0$
this far out in the wall layer.

There are two strategies for discretizing the $\nabla^2 p_t$ term: using a center difference
between times $n + 1$ and $n - 1$ (if the equation is sampled at level $n$), or use a
one-sided difference based on levels $n$ and $n - 1$. The latter has the advantage of
not leading to any equation system, while the former is second-order accurate as the
scheme for the simple wave equation $p_{tt} = c^2\nabla^2 p$. To avoid an equation system,
go for the one-sided difference such that the overall scheme becomes explicit and
only of first order in time.

Develop a 3D solver for the specified PDE and introduce a wall layer. Test 
the solver with the method of manufactured solutions. Make some demonstrations 
where the wall reflects and absorbs the waves (reflection because of discontinuity 
in $b$ and absorption because of growing $b$). Experiment with the impact of the $\tau_s$
parameter.

Filename: acoustics. 


Project 2.31: Solve a 1D transport equation 
We shall study the wave equation 


$$u_t + cu_x = 0, \quad x \in (0, L],\ t \in (0, T], \tag{2.180}$$

with initial condition

$$u(x, 0) = I(x), \quad x \in [0, L], \tag{2.181}$$

and one periodic boundary condition

$$u(0, t) = u(L, t). \tag{2.182}$$


This boundary condition means that what goes out of the domain at x = L comes 
in at x = 0. Roughly speaking, we need only one boundary condition because the 
spatial derivative is of first order only. 


Physical interpretation The parameter c can be constant or variable, c = c(x). 
The equation (2.180) arises in transport problems where a quantity u, which could 
be temperature or concentration of some contaminant, is transported with the veloc- 
ity c of a fluid. In addition to the transport imposed by “travelling with the fluid”, u 
may also be transported by diffusion (such as heat conduction or Fickian diffusion), 
but we have in the model $u_t + cu_x$ assumed that diffusion effects are negligible,
which they often are.


a) Show that under the assumption of $c = \mbox{const}$,

$$u(x,t) = I(x - ct) \tag{2.183}$$

fulfills the PDE as well as the initial and boundary condition (provided $I(0) = I(L)$).

A widely used numerical scheme for (2.180) applies a forward difference in
time and a backward difference in space when $c > 0$:

$$[D_t^+ u + cD_x^- u = 0]_i^n. \tag{2.184}$$

For $c < 0$ we use a forward difference in space: $[cD_x^+ u]_i^n$.

b) Set up a computational algorithm and implement it in a function. Assume $c$ is
constant and positive.

c) Test the implementation by using the remarkable property that the numerical
solution is exact at the mesh points if $\Delta t = c^{-1}\Delta x$.
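A minimal sketch of the scheme (2.184) for constant $c > 0$ with the periodic condition (2.182) may look as follows. The function name upwind_periodic, its argument list, and the use of the number $C = c\Delta t/\Delta x$ as input are assumptions made for this illustration only.

```python
import numpy as np

def upwind_periodic(I, c, L, Nx, C, T):
    """Forward-in-time, backward-in-space scheme for u_t + c*u_x = 0, c > 0.

    C = c*dt/dx is prescribed, and dt is computed from it.
    Returns the solution at the final time, the mesh, and the final time.
    """
    x = np.linspace(0, L, Nx + 1)
    dx = x[1] - x[0]
    dt = C*dx/c
    Nt = int(round(T/dt))
    u_n = I(x)                      # initial condition, assumes I(0) = I(L)
    u = np.zeros(Nx + 1)
    for n in range(Nt):
        # Backward difference in space at all points except i = 0
        u[1:] = u_n[1:] - C*(u_n[1:] - u_n[:-1])
        u[0] = u[Nx]                # periodic boundary condition
        u_n, u = u, u_n
    return u_n, x, Nt*dt

# With C = 1 the scheme just shifts the data one cell per time step
I = lambda x: np.sin(np.pi*x)**10
u, x, t_end = upwind_periodic(I, c=1.0, L=1.0, Nx=50, C=1.0, T=0.5)
u_e = I(np.mod(x - 1.0*t_end, 1.0))   # periodic extension of I(x - c*t)
max_error = np.abs(u - u_e).max()
```

For $C = 1$ the update reduces to copying the left neighbor, so max_error should be at the level of rounding errors, which is the property referred to in c).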

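d) Make a movie comparing the numerical and exact solution for the following two
choices of initial conditions:

$$I(x) = \left[\sin\left(\pi\frac{x}{L}\right)\right]^{2n}, \tag{2.185}$$

where $n$ is an integer, typically $n = 5$, and

$$I(x) = \exp\left(-\frac{(x - L/2)^2}{2\sigma^2}\right). \tag{2.186}$$

Choose $\Delta t = c^{-1}\Delta x$, $0.9c^{-1}\Delta x$, $0.5c^{-1}\Delta x$.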

e) The performance of the suggested numerical scheme can be investigated by
analyzing the numerical dispersion relation. Analytically, we have that the Fourier
component

$$u(x,t) = e^{i(kx - \omega t)}$$

is a solution of the PDE if $\omega = kc$. This is the analytical dispersion relation.
A complete solution of the PDE can be built by adding up such Fourier components
with different amplitudes, where the initial condition $I$ determines the
amplitudes. The solution $u$ is then represented by a Fourier series.

A similar discrete Fourier component at $(x_p, t_n)$ is

$$u_p^n = e^{i(kp\Delta x - \tilde\omega n\Delta t)},$$

where in general $\tilde\omega$ is a function of $k$, $\Delta t$, and $\Delta x$, and differs from the exact
$\omega = kc$.

Insert the discrete Fourier component in the numerical scheme and derive an
expression for $\tilde\omega$, i.e., the discrete dispersion relation. Show in particular that
if $c\Delta t/\Delta x = 1$, the discrete solution coincides with the exact solution at the
mesh points, regardless of the mesh resolution (!). Show that if the stability
condition

$$\frac{c\Delta t}{\Delta x} \le 1$$

holds, the discrete Fourier component cannot grow (i.e., $\tilde\omega$ is real).

f) Write a test for your implementation where you try to use information from the
numerical dispersion relation.

We shall hereafter assume that $c(x) > 0$.
g) Set up a computational algorithm for the variable coefficient case and implement
it in a function. Make a test that the function works for constant $c$.

h) It can be shown that for an observer moving with velocity $c(x)$, $u$ is constant.
This can be used to derive an exact solution when $c$ varies with $x$. Show first
that

$$u(x,t) = f(C(x) - t), \tag{2.187}$$

where

$$C'(x) = \frac{1}{c(x)},$$

is a solution of (2.180) for any differentiable function $f$.

i) Use the initial condition to show that an exact solution is

$$u(x,t) = I(C^{-1}(C(x) - t)),$$

with $C^{-1}$ being the inverse function of $C = \int c^{-1}dx$. Since $C(x)$ is an integral
$\int_0^x (1/c)dx$, $C(x)$ is monotonically increasing and there exists hence an inverse
function $C^{-1}$ with values in $[0, L]$.

To compute (2.187) we need to integrate $1/c$ to obtain $C$ and then compute the
inverse of $C$.

The inverse function computation can be easily done if we first think discretely. 
Say we have some function $y = g(x)$ and seek its inverse. Plotting $(x_i, y_i)$,
where $y_i = g(x_i)$ for some mesh points $x_i$, displays $g$ as a function of $x$. The
inverse function is simply $x$ as a function of $g$, i.e., the curve with points $(y_i, x_i)$.
We can therefore quickly compute points at the curve of the inverse function. 
One way of extending these points to a continuous function is to assume a lin- 
ear variation (known as linear interpolation) between the points (which actually 
means to draw straight lines between the points, exactly as done by a plotting 
program). 

The function wrap2callable in scitools.std can take a set of points and 
return a continuous function that corresponds to linear variation between the 
points. The computation of the inverse of a function g on [0, L] can then be 
done by 


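from numpy import linspace

def inverse(g, domain, resolution=101):
    # Sample g on a uniform mesh over [domain[0], domain[1]]
    x = linspace(domain[0], domain[1], resolution)
    y = g(x)
    from scitools.std import wrap2callable
    # Swapping x and y gives points on the curve of the inverse function
    g_inverse = wrap2callable((y, x))
    return g_inverse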
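If scitools is not available, the same idea can be sketched with linear interpolation from NumPy. This is an alternative to wrap2callable, not the code from the text; the name inverse_interp is hypothetical, and the sketch assumes that g is monotonically increasing on the domain, since numpy.interp requires increasing sample points.

```python
import numpy as np

def inverse_interp(g, domain, resolution=101):
    """Return a function approximating the inverse of an increasing g."""
    x = np.linspace(domain[0], domain[1], resolution)
    y = g(x)
    # The points (y_i, x_i) lie on the inverse curve; interpolate linearly
    return lambda y_new: np.interp(y_new, y, x)

g_inv = inverse_interp(lambda x: x**2, [0, 2], resolution=401)
value_at_1 = g_inv(1.0)      # approximates sqrt(1.0)
value_at_225 = g_inv(2.25)   # approximates sqrt(2.25)
```

Arguments outside the sampled range of g are clamped to the end values by numpy.interp, which is worth keeping in mind when $C(x) - t$ leaves the interval covered by $C$.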


To compute $C(x)$ we need to integrate $1/c$, which can be done by a Trapezoidal
rule. Suppose we have computed $C(x_i)$ and need to compute $C(x_{i+1})$. Using
the Trapezoidal rule with $m$ subintervals over the integration domain $[x_i, x_{i+1}]$
gives

$$C(x_{i+1}) = C(x_i) + \int_{x_i}^{x_{i+1}}\frac{dx}{c} \approx C(x_i) + h\left(\frac{1}{2}\frac{1}{c(x_i)} + \frac{1}{2}\frac{1}{c(x_{i+1})} + \sum_{j=1}^{m-1}\frac{1}{c(x_i + jh)}\right), \tag{2.188}$$

where $h = (x_{i+1} - x_i)/m$ is the length of the subintervals used for the integral
over $[x_i, x_{i+1}]$. We observe that (2.188) is a difference equation which we
can solve by repeatedly applying (2.188) for $i = 0, 1, \ldots, N_x - 1$ if a mesh
$x_0, x_1, \ldots, x_{N_x}$ is prescribed. Note that $C(0) = 0$.
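The repeated application of the Trapezoidal rule (2.188) can be sketched as below. The function name integrate_inverse_c and the default m=10 are assumptions for illustration; c is taken to be a Python function of a scalar x.

```python
import numpy as np

def integrate_inverse_c(c, x, m=10):
    """Compute C(x_i), the integral of 1/c from x_0 to x_i, by (2.188)."""
    C = np.zeros(len(x))                 # C[0] = 0
    for i in range(len(x) - 1):
        h = (x[i+1] - x[i])/m            # subinterval length
        interior = sum(1.0/c(x[i] + j*h) for j in range(1, m))
        C[i+1] = C[i] + h*(0.5/c(x[i]) + 0.5/c(x[i+1]) + interior)
    return C

# Check against a case with known integral: c = 1 + x gives C(x) = ln(1 + x)
x = np.linspace(0, 1, 11)
C = integrate_inverse_c(lambda x: 1 + x, x, m=10)
error = np.abs(C - np.log(1 + x)).max()
```

Since the Trapezoidal rule is second-order accurate, the error here should decrease by about a factor of four when m is doubled.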


j) Implement a function for computing $C(x_i)$ and one for computing $C^{-1}(x)$ for
any $x$. Use these two functions for computing the exact solution $I(C^{-1}(C(x) - t))$.
End up with a function u_exact_variable_c(x, n, c, I) that returns
the value of $I(C^{-1}(C(x) - t_n))$.

k) Make movies showing a comparison of the numerical and exact solutions for the
two initial conditions (2.185) and (2.186). Choose $\Delta t = \Delta x/\max_{x\in[0,L]} c(x)$ and
the velocity of the medium as
(a) $c(x) = 1 + \epsilon\sin(k\pi x/L)$, $\epsilon < 1$,
(b) $c(x) = 1 + I(x)$, where $I$ is given by (2.185) or (2.186).
The PDE $u_t + cu_x = 0$ expresses that the initial condition $I(x)$ is transported
with velocity $c(x)$.


Filename: advec1D.


Problem 2.32: General analytical solution of a 1D damped wave equation 
We consider an initial-boundary value problem for the damped wave equation: 


$$u_{tt} + bu_t = c^2u_{xx}, \quad x \in (0, L),\ t \in (0, T],$$
$$u(0, t) = 0,$$
$$u(L, t) = 0,$$
$$u(x, 0) = I(x),$$
$$u_t(x, 0) = V(x).$$

Here, $b > 0$ and $c$ are given constants. The aim is to derive a general analytical
solution of this problem. Familiarity with the method of separation of variables for
solving PDEs will be assumed.


a) Seek a solution on the form $u(x,t) = X(x)T(t)$. Insert this solution in the PDE
and show that it leads to two differential equations for $X$ and $T$:

$$T'' + bT' + \lambda T = 0, \quad c^2X'' + \lambda X = 0,$$

with $X(0) = X(L) = 0$ as boundary conditions, and $\lambda$ as a constant to be
determined.


b) Show that $X(x)$ is on the form

$$X_n(x) = C_n\sin kx, \quad k = \frac{n\pi}{L}, \quad n = 1, 2, \ldots$$

where $C_n$ is an arbitrary constant.

c) Under the assumption that $(b/2)^2 < c^2k^2$, show that $T(t)$ is on the form

$$T_n(t) = e^{-\frac{1}{2}bt}(a_n\cos\omega t + b_n\sin\omega t), \quad \omega = \sqrt{c^2k^2 - \frac{1}{4}b^2}.$$

The complete solution is then

$$u(x,t) = \sum_{n=1}^{\infty}\sin kx\, e^{-\frac{1}{2}bt}(A_n\cos\omega t + B_n\sin\omega t),$$


where the constants $A_n$ and $B_n$ must be computed from the initial conditions.

d) Derive a formula for $A_n$ from $u(x, 0) = I(x)$ and developing $I(x)$ as a sine
Fourier series on $[0, L]$.

e) Derive a formula for $B_n$ from $u_t(x, 0) = V(x)$ and developing $V(x)$ as a sine
Fourier series on $[0, L]$.

f) Calculate $A_n$ and $B_n$ from vibrations of a string where $V(x) = 0$ and

$$I(x) = \begin{cases} ax/x_0, & x < x_0,\\ a(L - x)/(L - x_0), & \mbox{otherwise}.\end{cases} \tag{2.189}$$

g) Implement a function u_series(x, t, tol=1E-10) for the series for $u(x, t)$,
where tol is a tolerance for truncating the series. Simply sum the terms until
$|a_n|$ and $|b_n|$ both are less than tol.

h) What will change in the derivation of the analytical solution if we have
$u_x(0, t) = u_x(L, t) = 0$ as boundary conditions? And how will you solve
the problem with $u(0, t) = 0$ and $u_x(L, t) = 0$?


Filename: damped_wave1D. 


Problem 2.33: General analytical solution of a 2D damped wave equation 
Carry out Problem 2.32 in the 2D case: $u_{tt} + bu_t = c^2(u_{xx} + u_{yy})$, where $(x, y) \in
(0, L_x) \times (0, L_y)$. Assume a solution on the form $u(x, y, t) = X(x)Y(y)T(t)$.
Filename: damped_wave2D.

Diffusion Equations


The famous diffusion equation, also known as the heat equation, reads 


$$\frac{\partial u}{\partial t} = \alpha\frac{\partial^2 u}{\partial x^2},$$

where $u(x,t)$ is the unknown function to be solved for, $x$ is a coordinate in space,
and $t$ is time. The coefficient $\alpha$ is the diffusion coefficient and determines how fast
$u$ changes in time. A quick short form for the diffusion equation is $u_t = \alpha u_{xx}$.

Compared to the wave equation, $u_{tt} = c^2u_{xx}$, which looks very similar, the
diffusion equation features solutions that are very different from those of the wave
equation. Also, the diffusion equation makes quite different demands to the numerical
methods.

Typical diffusion problems may experience rapid change in the very beginning,
but then the evolution of $u$ becomes slower and slower. The solution is usually very
smooth, and after some time, one cannot recognize the initial shape of $u$. This is in
sharp contrast to solutions of the wave equation where the initial shape is preserved
in homogeneous media; the solution is then basically a moving initial condition.
The standard wave equation $u_{tt} = c^2u_{xx}$ has solutions that propagate with speed
$c$ forever, without changing shape, while the diffusion equation converges to a
stationary solution $\bar u(x)$ as $t \to \infty$. In this limit, $u_t = 0$, and $\bar u$ is governed by
$\bar u''(x) = 0$. This stationary limit of the diffusion equation is called the Laplace
equation and arises in a very wide range of applications throughout the sciences.

It is possible to solve for u(x, t) using an explicit scheme, as we do in Sect. 3.1, 
but the time step restrictions soon become much less favorable than for an explicit 
scheme applied to the wave equation. And of more importance, since the solution 
u of the diffusion equation is very smooth and changes slowly, small time steps are 
not convenient and not required by accuracy as the diffusion process converges to a 
stationary state. Therefore, implicit schemes (as described in Sect. 3.2) are popular, 
but these require solutions of systems of algebraic equations. We shall use ready- 
made software for this purpose, but also program some simple iterative methods. 
The exposition is, as usual in this book, very basic and focuses on the basic ideas 
and how to implement. More comprehensive mathematical treatments and classical 
analysis of the methods are found in lots of textbooks. A favorite of ours in this 
respect is the one by LeVeque [13]. The books by Strikwerda [17] and by Lapidus 
and Pinder [12] are also highly recommended as additional material on the topic.

3.1 An Explicit Method for the 1D Diffusion Equation


Explicit finite difference methods for the wave equation $u_{tt} = c^2u_{xx}$ can be used,
with small modifications, for solving $u_t = \alpha u_{xx}$ as well. The exposition below
assumes that the reader is familiar with the basic ideas of discretization and imple- 
mentation of wave equations from Chapter 2. Readers not familiar with the Forward 
Euler, Backward Euler, and Crank-Nicolson (or centered or midpoint) discretization 
methods in time should consult, e.g., Section 1.1 in [9]. 


3.1.1 The Initial-Boundary Value Problem for 1D Diffusion 


To obtain a unique solution of the diffusion equation, or equivalently, to apply nu- 
merical methods, we need initial and boundary conditions. The diffusion equation 
goes with one initial condition $u(x, 0) = I(x)$, where $I$ is a prescribed function.
One boundary condition is required at each point on the boundary, which in 1D
means that $u$ must be known, $u_x$ must be known, or some combination of them.

We shall start with the simplest boundary condition: u = 0. The complete 
initial-boundary value diffusion problem in one space dimension can then be spec- 
ified as 


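$$\frac{\partial u}{\partial t} = \alpha\frac{\partial^2 u}{\partial x^2} + f, \quad x \in (0, L),\ t \in (0, T], \tag{3.1}$$
$$u(x, 0) = I(x), \quad x \in [0, L], \tag{3.2}$$
$$u(0, t) = 0, \quad t > 0, \tag{3.3}$$
$$u(L, t) = 0, \quad t > 0. \tag{3.4}$$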


With only a first-order derivative in time, only one initial condition is needed, while 
the second-order derivative in space leads to a demand for two boundary condi- 
tions. We have added a source term f = f(x,t), which is convenient when testing 
implementations. 

Diffusion equations like (3.1) have a wide range of applications throughout phys- 
ical, biological, and financial sciences. One of the most common applications is 
propagation of heat, where $u(x,t)$ represents the temperature of some substance at
point $x$ and time $t$. Other applications are listed in Sect. 3.8.


3.1.2 Forward Euler Scheme 


The first step in the discretization procedure is to replace the domain [0, L] x [0, T] 
by a set of mesh points. Here we apply equally spaced mesh points 


$$x_i = i\Delta x, \quad i = 0, \ldots, N_x,$$

and

$$t_n = n\Delta t, \quad n = 0, \ldots, N_t.$$

Moreover, $u_i^n$ denotes the mesh function that approximates $u(x_i, t_n)$ for $i =
0, \ldots, N_x$ and $n = 0, \ldots, N_t$. Requiring the PDE (3.1) to be fulfilled at a mesh
point $(x_i, t_n)$ leads to the equation

$$\frac{\partial}{\partial t}u(x_i, t_n) = \alpha\frac{\partial^2}{\partial x^2}u(x_i, t_n) + f(x_i, t_n). \tag{3.5}$$

The next step is to replace the derivatives by finite difference approximations. The
computationally simplest method arises from using a forward difference in time and
a central difference in space:

$$[D_t^+ u = \alpha D_xD_x u + f]_i^n. \tag{3.6}$$

Written out,

$$\frac{u_i^{n+1} - u_i^n}{\Delta t} = \alpha\frac{u_{i+1}^n - 2u_i^n + u_{i-1}^n}{\Delta x^2} + f_i^n. \tag{3.7}$$

We have turned the PDE into algebraic equations, also often called discrete equations.
The key property of the equations is that they are algebraic, which makes
them easy to solve. As usual, we anticipate that $u_i^n$ is already computed such that
$u_i^{n+1}$ is the only unknown in (3.7). Solving with respect to this unknown is easy:

$$u_i^{n+1} = u_i^n + F\left(u_{i+1}^n - 2u_i^n + u_{i-1}^n\right) + \Delta t f_i^n, \tag{3.8}$$

where we have introduced the mesh Fourier number:

$$F = \alpha\frac{\Delta t}{\Delta x^2}. \tag{3.9}$$

F is the key parameter in the discrete diffusion equation

Note that $F$ is a dimensionless number that lumps the key physical parameter
in the problem, $\alpha$, and the discretization parameters $\Delta x$ and $\Delta t$ into a single
parameter. Properties of the numerical method are critically dependent upon the
value of $F$ (see Sect. 3.3 for details).


The computational algorithm then becomes 


1. compute $u_i^0 = I(x_i)$ for $i = 0, \ldots, N_x$
2. for $n = 0, 1, \ldots, N_t$:
   (a) apply (3.8) for all the internal spatial points $i = 1, \ldots, N_x - 1$
   (b) set the boundary values $u_i^{n+1} = 0$ for $i = 0$ and $i = N_x$


The algorithm is compactly and fully specified in Python: 


import numpy as np

x = np.linspace(0, L, Nx+1)    # mesh points in space
dx = x[1] - x[0]
t = np.linspace(0, T, Nt+1)    # mesh points in time
dt = t[1] - t[0]
F = a*dt/dx**2
u   = np.zeros(Nx+1)           # unknown u at new time level
u_n = np.zeros(Nx+1)           # u at the previous time level

# Set initial condition u(x,0) = I(x)
for i in range(0, Nx+1):
    u_n[i] = I(x[i])

for n in range(0, Nt):
    # Compute u at inner mesh points
    for i in range(1, Nx):
        u[i] = u_n[i] + F*(u_n[i-1] - 2*u_n[i] + u_n[i+1]) + \
               dt*f(x[i], t[n])

    # Insert boundary conditions
    u[0] = 0;  u[Nx] = 0

    # Update u_n before next step
    u_n[:] = u


Note that we use a for $\alpha$ in the code, motivated by easy visual mapping between the
variable name and the mathematical symbol in formulas.

We need to state already now that the shown algorithm does not produce meaningful
results unless $F \le 1/2$. Why is explained in Sect. 3.3.
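The restriction on $F$ is easy to observe experimentally. The sketch below runs the scheme (3.8) with $f = 0$ for a value of $F$ at the limit and one above it. The function name forward_euler_max_amplitude, the plug-shaped initial condition, and the parameter values are hypothetical choices made only for this demonstration.

```python
import numpy as np

def forward_euler_max_amplitude(F, Nx=20, num_steps=200):
    """Run (3.8) with f=0 and u=0 on the boundary; return max |u| at the end."""
    x = np.linspace(0, 1, Nx + 1)
    # Plug-shaped initial condition (contains high-frequency components)
    u_n = np.where(np.abs(x - 0.5) < 0.1, 1.0, 0.0)
    u = np.zeros(Nx + 1)
    for n in range(num_steps):
        u[1:-1] = u_n[1:-1] + F*(u_n[:-2] - 2*u_n[1:-1] + u_n[2:])
        u[0] = 0;  u[-1] = 0
        u_n, u = u, u_n
    return np.abs(u_n).max()

amp_stable = forward_euler_max_amplitude(F=0.5)     # stays bounded by max |I|
amp_unstable = forward_euler_max_amplitude(F=0.6)   # short waves are amplified
```

For $F = 0.5$ each new value is an average of two old neighbor values, so the amplitude cannot exceed that of the initial condition, while for $F = 0.6$ the shortest waves on the mesh grow in every step.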


3.1.3 Implementation 


The file diffu1D_u0.py contains a complete function solver_FE_simple for 
solving the 1D diffusion equation with u = 0 on the boundary as specified in the 
algorithm above: 


import numpy as np

def solver_FE_simple(I, a, f, L, dt, F, T):
    """
    Simplest expression of the computational algorithm
    using the Forward Euler method and explicit Python loops.
    For this method F <= 0.5 for stability.
    """
    # For measuring the CPU time (time.clock was removed in Python 3.8)
    import time;  t0 = time.process_time()

    Nt = int(round(T/float(dt)))
    t = np.linspace(0, Nt*dt, Nt+1)   # Mesh points in time
    dx = np.sqrt(a*dt/F)
    Nx = int(round(L/dx))
    x = np.linspace(0, L, Nx+1)       # Mesh points in space
    # Make sure dx and dt are compatible with x and t
    dx = x[1] - x[0]
    dt = t[1] - t[0]

    u   = np.zeros(Nx+1)
    u_n = np.zeros(Nx+1)

    # Set initial condition u(x,0) = I(x)
    for i in range(0, Nx+1):
        u_n[i] = I(x[i])

    for n in range(0, Nt):
        # Compute u at inner mesh points
        for i in range(1, Nx):
            u[i] = u_n[i] + F*(u_n[i-1] - 2*u_n[i] + u_n[i+1]) + \
                   dt*f(x[i], t[n])

        # Insert boundary conditions
        u[0] = 0;  u[Nx] = 0

        # Switch variables before next step
        #u_n[:] = u  # safe, but slow
        u_n, u = u, u_n

    t1 = time.process_time()
    return u_n, x, t, t1-t0  # u_n holds latest u


A faster alternative is available in the function solver_FE, which adds the pos- 
sibility of solving the finite difference scheme by vectorization. The vectorized 
version replaces the explicit loop 


for i in range(1, Nx):
    u[i] = u_n[i] + F*(u_n[i-1] - 2*u_n[i] + u_n[i+1]) \
           + dt*f(x[i], t[n])

by arithmetics on displaced slices of the u array:

u[1:Nx] = u_n[1:Nx] + F*(u_n[0:Nx-1] - 2*u_n[1:Nx] + u_n[2:Nx+1]) \
          + dt*f(x[1:Nx], t[n])
# or
u[1:-1] = u_n[1:-1] + F*(u_n[0:-2] - 2*u_n[1:-1] + u_n[2:]) \
          + dt*f(x[1:-1], t[n])


For example, the vectorized version runs 70 times faster than the scalar version in a
case with 100 time steps and a spatial mesh of $10^5$ cells.
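Before trusting a vectorized update, it is wise to check that it reproduces the scalar loop. A minimal sketch of such a check for one time step is shown below; the mesh size, the value of F, and the source term are hypothetical test data. Note that the vectorized form requires f to accept an array as its x argument.

```python
import numpy as np

Nx = 10
F = 0.4
dt = 0.01
t_n = 0.3                             # current time level (test value)
x = np.linspace(0, 1, Nx + 1)
u_n = np.sin(np.pi*x)                 # some known values at the old time level
f = lambda x, t: x*(1 - x) + t        # hypothetical source term, array-friendly

# Scalar version: explicit loop over the inner mesh points
u_scalar = np.zeros(Nx + 1)
for i in range(1, Nx):
    u_scalar[i] = u_n[i] + F*(u_n[i-1] - 2*u_n[i] + u_n[i+1]) \
                  + dt*f(x[i], t_n)

# Vectorized version: arithmetics on displaced slices
u_vec = np.zeros(Nx + 1)
u_vec[1:-1] = u_n[1:-1] + F*(u_n[:-2] - 2*u_n[1:-1] + u_n[2:]) \
              + dt*f(x[1:-1], t_n)

max_diff = np.abs(u_scalar - u_vec).max()
```

The two versions perform the same floating-point operations point by point, so max_diff is expected to be zero or at the level of rounding errors.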


The solver_FE function also features a callback function such that the user 
can process the solution at each time level. The callback function looks like 
user_action(u, x, t, n), where u is the array containing the solution at time 
level n, x holds all the spatial mesh points, while t holds all the temporal mesh 
points. The solver_FE function is very similar to solver_FE_simple above: 


def solver_FE(I, a, f, L, dt, F, T,
              user_action=None, version='scalar'):
    """
    Vectorized implementation of solver_FE_simple.
    """
    # For measuring the CPU time (time.clock was removed in Python 3.8)
    import time;  t0 = time.process_time()

    Nt = int(round(T/float(dt)))
    t = np.linspace(0, Nt*dt, Nt+1)   # Mesh points in time
    dx = np.sqrt(a*dt/F)
    Nx = int(round(L/dx))
    x = np.linspace(0, L, Nx+1)       # Mesh points in space
    # Make sure dx and dt are compatible with x and t
    dx = x[1] - x[0]
    dt = t[1] - t[0]

    u   = np.zeros(Nx+1)   # solution array
    u_n = np.zeros(Nx+1)   # solution at t-dt

    # Set initial condition
    for i in range(0, Nx+1):
        u_n[i] = I(x[i])

    if user_action is not None:
        user_action(u_n, x, t, 0)

    for n in range(0, Nt):
        # Update all inner points
        if version == 'scalar':
            for i in range(1, Nx):
                u[i] = u_n[i] +\
                       F*(u_n[i-1] - 2*u_n[i] + u_n[i+1]) +\
                       dt*f(x[i], t[n])

        elif version == 'vectorized':
            u[1:Nx] = u_n[1:Nx] + \
                      F*(u_n[0:Nx-1] - 2*u_n[1:Nx] + u_n[2:Nx+1]) +\
                      dt*f(x[1:Nx], t[n])
        else:
            raise ValueError('version=%s' % version)

        # Insert boundary conditions
        u[0] = 0;  u[Nx] = 0
        if user_action is not None:
            user_action(u, x, t, n+1)

        # Switch variables before next step
        u_n, u = u, u_n

    t1 = time.process_time()
    return t1-t0
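To illustrate how a callback of the form user_action(u, x, t, n) can be used, the sketch below stores snapshots of the solution at selected time levels. To keep the example self-contained it uses a compact solver, fe_solver, which is a hypothetical stand-in for solver_FE with $f = 0$; the class StoreSnapshots is likewise an assumption made for illustration.

```python
import numpy as np

def fe_solver(I, a, L, Nx, F, Nt, user_action=None):
    """Compact Forward Euler solver (f=0, u=0 at the ends); illustration only."""
    x = np.linspace(0, L, Nx + 1)
    dx = x[1] - x[0]
    dt = F*dx**2/a
    t = np.linspace(0, Nt*dt, Nt + 1)
    u_n = I(x)
    u = np.zeros(Nx + 1)
    if user_action is not None:
        user_action(u_n, x, t, 0)
    for n in range(Nt):
        u[1:-1] = u_n[1:-1] + F*(u_n[:-2] - 2*u_n[1:-1] + u_n[2:])
        u[0] = 0;  u[-1] = 0
        if user_action is not None:
            user_action(u, x, t, n + 1)
        u_n, u = u, u_n
    return x, t

class StoreSnapshots:
    """Callback that stores the solution at every 'every'-th time level."""
    def __init__(self, every=10):
        self.every = every
        self.times = []
        self.solutions = []

    def __call__(self, u, x, t, n):
        if n % self.every == 0:
            self.times.append(t[n])
            # Copy, since the solver reuses its arrays between time levels
            self.solutions.append(u.copy())

action = StoreSnapshots(every=10)
x, t = fe_solver(I=lambda x: np.sin(np.pi*x), a=1.0, L=1.0, Nx=20, F=0.5,
                 Nt=50, user_action=action)
```

Storing u.copy() rather than u matters here: because the solver switches the roles of u and u_n instead of allocating new arrays, a stored reference would silently be overwritten at later time levels.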


3.1.4 Verification 


Exact solution of discrete equations Before thinking about running the functions 
in the previous section, we need to construct a suitable test example for verification. 
It appears that a manufactured solution that is linear in time and at most quadratic 
in space fulfills the Forward Euler scheme exactly. With the restriction that u = 0 
for x = 0, L, we can try the solution 


$$u(x,t) = 5tx(L - x).$$

Inserted in the PDE, it requires a source term

$$f(x,t) = 10\alpha t + 5x(L - x).$$


With the formulas from Appendix A.4 we can easily check that the manufactured u 
fulfills the scheme: 


$$[D_t^+ u = \alpha D_x D_x u + f]_i^n = [5x(L-x)D_t^+ t = 5t\alpha D_x D_x (xL - x^2) + 10\alpha t + 5x(L-x)]_i^n$$
$$= [5x(L-x) = 5t\alpha(-2) + 10\alpha t + 5x(L-x)]_i^n,$$


which is a 0=0 expression. The computation of the source term, given any u, is 
easily automated with sympy: 


import sympy as sym
x, t, a, L = sym.symbols('x t a L')
u = x*(L-x)*5*t

def pde(u):
    return sym.diff(u, t) - a*sym.diff(u, x, x)

f = sym.simplify(pde(u))


Now we can choose any expression for u and automatically get the suitable source 
term f. However, the manufactured solution u will in general not be exactly repro- 
duced by the scheme: only constant and linear functions are differentiated correctly 
by a forward difference, while only constant, linear, and quadratic functions are 
differentiated exactly by a $[D_x D_x u]_i$ difference.
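The exactness claims can be checked numerically for the manufactured solution $u = 5tx(L-x)$. The snippet below is a small sketch under assumed values ($L = 1.5$, four mesh points, an arbitrary time level and step); it compares the forward difference in time and the centered second difference in space with the analytical derivatives $u_t = 5x(L-x)$ and $u_{xx} = -10t$.

```python
import numpy as np

L, dt = 1.5, 0.125            # assumed values for the check
x = np.linspace(0, L, 4)      # 3 cells
dx = x[1] - x[0]

def u_exact(x, t):
    return 5*t*x*(L - x)

t_n = 0.25
# Forward difference in time versus the exact u_t = 5x(L-x)
Dt = (u_exact(x, t_n + dt) - u_exact(x, t_n))/dt
err_t = np.abs(Dt - 5*x*(L - x)).max()

# Centered second difference in space versus the exact u_xx = -10t
u = u_exact(x, t_n)
DxDx = (u[:-2] - 2*u[1:-1] + u[2:])/dx**2
err_x = np.abs(DxDx - (-10*t_n)).max()
print(err_t, err_x)
```

Both errors are at the level of rounding, which is why this particular u is reproduced to machine precision by the scheme.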

The numerical code will need to access the u and f above as Python func- 
tions. The exact solution is wanted as a Python function u_exact(x, t), while 
the source term is wanted as f(x, t). The parameters a and L in u and f above 
are symbols and must be replaced by float objects in a Python function. This can 
be done by redefining a and L as float objects and performing substitutions of 
symbols by numbers in u and f. The appropriate code looks like this: 


a = 0.5
L = 1.5
u_exact = sym.lambdify(
    [x, t], u.subs('L', L).subs('a', a), modules='numpy')
f = sym.lambdify(
    [x, t], f.subs('L', L).subs('a', a), modules='numpy')
I = lambda x: u_exact(x, 0)


Here we also make a function I for the initial condition. 

The idea now is that our manufactured solution should be exactly reproduced 
by the code (to machine precision). For this purpose we make a test function for 
comparing the exact and numerical solutions at the end of the time interval:

def test_solver_FE():
    # Define u_exact, f, I as explained above

    dx = L/3  # 3 cells
    F = 0.5
    dt = F*dx**2

    u, x, t, cpu = solver_FE_simple(
        I=I, a=a, f=f, L=L, dt=dt, F=F, T=2)
    u_e = u_exact(x, t[-1])
    diff = abs(u_e - u).max()
    tol = 1E-14
    assert diff < tol, 'max diff solver_FE_simple: %g' % diff

    u, x, t, cpu = solver_FE(
        I=I, a=a, f=f, L=L, dt=dt, F=F, T=2,
        user_action=None, version='scalar')
    u_e = u_exact(x, t[-1])
    diff = abs(u_e - u).max()
    tol = 1E-14
    assert diff < tol, 'max diff solver_FE, scalar: %g' % diff

    u, x, t, cpu = solver_FE(
        I=I, a=a, f=f, L=L, dt=dt, F=F, T=2,
        user_action=None, version='vectorized')
    u_e = u_exact(x, t[-1])
    diff = abs(u_e - u).max()
    tol = 1E-14
    assert diff < tol, 'max diff solver_FE, vectorized: %g' % diff


The critical value F = 0.5 

We emphasize that the value F=0.5 is critical: the tests above will fail if F has a larger value. This is because the Forward Euler scheme is unstable for $F > 1/2$. The reader may wonder if $F = 1/2$ is safe or if $F < 1/2$ should be required. Experiments show that $F = 1/2$ works fine for $u_t = \alpha u_{xx}$, so there is no accumulation of rounding errors in this case and hence no need to introduce any safety factor to keep F away from the limiting value 0.5.


Checking convergence rates If our chosen exact solution does not satisfy the dis- 
crete equations exactly, we are left with checking the convergence rates, just as 
we did previously for the wave equation. However, with the Euler scheme here, 
we have different accuracies in time and space, since we use a second order ap- 
proximation to the spatial derivative and a first order approximation to the time 
derivative. Thus, we must expect different convergence rates in time and space. For 
the numerical error, 
$$E = C_t\Delta t^r + C_x\Delta x^p,$$

we should get convergence rates $r = 1$ and $p = 2$ ($C_t$ and $C_x$ are unknown constants). As previously, in Sect. 2.2.3, we simplify matters by introducing a single discretization parameter h:

$$h = \Delta t, \quad \Delta x = Kh^{r/p},$$

where K is any constant. This allows us to factor out only one discretization parameter h from the formula:

$$E = C_t h^r + C_x (Kh^{r/p})^p = \tilde{C}h^r, \quad \tilde{C} = C_t + C_x K^p.$$

The computed rate r should approach 1 with increasing resolution.
It is tempting, for simplicity, to choose $K = 1$, which gives $\Delta x = h^{r/p}$, expected to be $\sqrt{\Delta t}$. However, we have to control the stability requirement: $F \le \frac{1}{2}$, which means

$$\frac{\alpha\Delta t}{\Delta x^2} \le \frac{1}{2} \quad\Rightarrow\quad \Delta x \ge \sqrt{2\alpha}\,h^{1/2},$$

implying that $K = \sqrt{2\alpha}$ is our choice in experiments where we lie on the stability limit $F = 1/2$.
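A convergence rate experiment of this kind can be sketched as follows. This is an illustration under stated assumptions, not the book's test code: the function fe_error is hypothetical, the problem is $u_t = u_{xx}$ on $[0,1]$ with $\alpha = 1$, $f = 0$, and the exact solution $e^{-\pi^2 t}\sin(\pi x)$, and the mesh is refined with $F = 1/2$ fixed so that $\Delta x = \sqrt{2}\,h^{1/2}$ with $h = \Delta t$. Rates are estimated from consecutive experiments as $r = \ln(E_{i-1}/E_i)/\ln(h_{i-1}/h_i)$.

```python
import numpy as np

def fe_error(Nx, F=0.5, T=0.1):
    """Return (dt, max error at t=T) for Forward Euler with a = 1, f = 0."""
    x = np.linspace(0, 1, Nx+1)
    dx = x[1] - x[0]
    dt = F*dx**2
    Nt = int(round(T/dt))
    u = np.sin(np.pi*x)
    u[0] = u[-1] = 0                      # Dirichlet conditions
    for n in range(Nt):
        u[1:-1] = u[1:-1] + F*(u[:-2] - 2*u[1:-1] + u[2:])
    u_e = np.exp(-np.pi**2*Nt*dt)*np.sin(np.pi*x)
    return dt, np.abs(u_e - u).max()

h, E = zip(*[fe_error(Nx) for Nx in (10, 20, 40, 80)])
rates = [np.log(E[i-1]/E[i])/np.log(h[i-1]/h[i])
         for i in range(1, len(h))]
print(rates)
```

Since h is the time step, the printed rates should approach 1 as the mesh is refined.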


3.1.5 Numerical Experiments 


When a test function like the one above runs silently without errors, we have some 
evidence for a correct implementation of the numerical method. The next step is to 
do some experiments with more interesting solutions. 

We target a scaled diffusion problem where $x/L$ is a new spatial coordinate and $\alpha t/L^2$ is a new time coordinate. The source term f is omitted, and u is scaled by $\max_{x\in[0,L]}|I(x)|$ (see Section 3.2 in [11] for details). The governing PDE is then

$$\frac{\partial u}{\partial t} = \frac{\partial^2 u}{\partial x^2},$$

in the spatial domain $[0, L]$, with boundary conditions $u(0) = u(L) = 0$ (the code below uses $L = 1$). Two initial conditions will be tested: a discontinuous plug,

$$I(x) = \begin{cases} 0, & |x - L/2| > 0.1,\\ 1, & \text{otherwise,} \end{cases}$$

and a smooth Gaussian function,

$$I(x) = e^{-\frac{1}{2\sigma^2}(x - L/2)^2}.$$

The functions plug and gaussian in diffu1D_u0.py run the two cases, respectively:


def plug(scheme='FE', F=0.5, Nx=50):
    L = 1.
    a = 1.
    T = 0.1
    # Compute dt from Nx and F
    dx = L/Nx;  dt = F/a*dx**2

    def I(x):
        """Plug profile as initial condition."""
        if abs(x-L/2.0) > 0.1:
            return 0
        else:
            return 1

    cpu = viz(I, a, L, dt, F, T,
              umin=-0.1, umax=1.1,
              scheme=scheme, animate=True, framefiles=True)
    print('CPU time:', cpu)


def gaussian(scheme='FE', F=0.5, Nx=50, sigma=0.05):
    L = 1.
    a = 1.
    T = 0.1
    # Compute dt from Nx and F
    dx = L/Nx;  dt = F/a*dx**2

    def I(x):
        """Gaussian profile as initial condition."""
        return np.exp(-0.5*((x-L/2.0)**2)/sigma**2)

    # viz returns the CPU time only
    cpu = viz(I, a, L, dt, F, T,
              umin=-0.1, umax=1.1,
              scheme=scheme, animate=True, framefiles=True)
    print('CPU time:', cpu)


These functions make use of the function viz for running the solver and visualizing 
the solution using a callback function with plotting: 


def viz(I, a, L, dt, F, T, umin, umax,
        scheme='FE', animate=True, framefiles=True):

    def plot_u(u, x, t, n):
        plt.plot(x, u, 'r-', axis=[0, L, umin, umax],
                 title='t=%f' % t[n])
        if framefiles:
            plt.savefig('tmp_frame%04d.png' % n)
        if t[n] == 0:
            time.sleep(2)
        elif not framefiles:
            # It takes time to write files so pause is needed
            # for screen only animation
            time.sleep(0.2)

    user_action = plot_u if animate else lambda u,x,t,n: None

    cpu = eval('solver_'+scheme)(I, a, L, dt, F, T,
                                 user_action=user_action)
    return cpu


Notice that this viz function stores all the solutions in a list solutions in the callback function. Modern computers have hardly any problem with storing a lot of such solutions for moderate values of $N_x$ in 1D problems, but for 2D and 3D problems, this technique cannot be used and solutions must be stored in files.

Fig. 3.1 Forward Euler scheme for F = 0.5

Our experiments employ a time step $\Delta t = 0.0002$ and simulate for $t \in [0, 0.1]$. First we try the highest value of F: $F = 0.5$. This resolution corresponds to $N_x = 50$. A possible terminal command is

Terminal> python -c 'from diffu1D_u0 import gaussian
gaussian("solver_FE", F=0.5, dt=0.0002)'


The u(x,t) curve as a function of x is shown in Fig. 3.1 at four time levels. 


Movie 1 https://raw.githubusercontent.com/hplgit/fdm-book/master/doc/pub/book/html/mov-diffu/diffu1D_u0_FE_plug/movie.ogg


We see that the curves have saw-tooth waves in the beginning of the simulation. This non-physical noise is smoothed out with time, but solutions of the diffusion equations are known to be smooth, and this numerical solution is definitely not smooth. Lowering F helps: $F \le 0.25$ gives a smooth solution, see Fig. 3.2 (and a movie^1).


^1 http://tinyurl.com/gokgkov/mov-diffu/diffu1D_u0_FE_plug_F025/movie.ogg

Fig. 3.2 Forward Euler scheme for F = 0.25


Increasing F slightly beyond the limit 0.5, to F = 0.51, gives growing, non- 
physical instabilities, as seen in Fig. 3.3. 

Instead of a discontinuous initial condition we now try the smooth Gaussian function for $I(x)$. A simulation for $F = 0.5$ is shown in Fig. 3.4. Now the numerical solution is smooth for all times, and this is true for any $F \le 0.5$.

Experiments with these two choices of $I(x)$ reveal some important observations:

- The Forward Euler scheme leads to growing solutions if $F > \frac{1}{2}$.
- $I(x)$ as a discontinuous plug leads to a saw tooth-like noise for $F = \frac{1}{2}$, which is absent for $F \le \frac{1}{4}$.
- The smooth Gaussian initial function leads to a smooth solution for all relevant F values ($F \le \frac{1}{2}$).
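The first observation is easy to reproduce without any plotting. The sketch below is an assumption-laden illustration (hypothetical function fe_plug_max, scaled problem with $\alpha = 1$, $L = 1$, $N_x = 50$, $T = 0.1$): it runs the Forward Euler scheme from the plug and reports the largest absolute value of the final solution for $F = 0.5$ and $F = 0.51$.

```python
import numpy as np

def fe_plug_max(F, Nx=50, T=0.1):
    """Max |u| at t=T for Forward Euler started from the plug (a = 1)."""
    x = np.linspace(0, 1, Nx+1)
    dx = x[1] - x[0]
    dt = F*dx**2
    Nt = int(round(T/dt))
    u = np.where(np.abs(x - 0.5) > 0.1, 0.0, 1.0)   # plug profile
    for n in range(Nt):
        # Boundary values u[0] and u[-1] stay at zero
        u[1:-1] = u[1:-1] + F*(u[:-2] - 2*u[1:-1] + u[2:])
    return np.abs(u).max()

umax_stable = fe_plug_max(0.5)
umax_unstable = fe_plug_max(0.51)
print(umax_stable, umax_unstable)
```

For $F = 0.5$ each new value is an average of neighboring old values, so the solution stays within the range of the initial data, while for $F = 0.51$ the shortest waves are amplified in every step.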


3.2 Implicit Methods for the 1D Diffusion Equation 


Simulations with the Forward Euler scheme show that the time step restriction, $F \le \frac{1}{2}$, which means $\Delta t \le \Delta x^2/(2\alpha)$, may be relevant in the beginning of the diffusion process, when the solution changes quite fast, but as time increases, the process slows down, and a small $\Delta t$ may be inconvenient. With implicit schemes, which lead to coupled systems of linear equations to be solved at each time level, any size of $\Delta t$ is possible (but the accuracy decreases with increasing $\Delta t$). The Backward Euler scheme, derived and implemented below, is the simplest implicit scheme for the diffusion equation.

Fig. 3.3 Forward Euler scheme for F = 0.51


3.2.1 Backward Euler Scheme 


In (3.5), we now apply a backward difference in time, but the same central differ- 
ence in space: 


$$[D_t^- u = \alpha D_x D_x u + f]_i^n, \tag{3.10}$$

which written out reads

$$\frac{u_i^n - u_i^{n-1}}{\Delta t} = \alpha\frac{u_{i+1}^n - 2u_i^n + u_{i-1}^n}{\Delta x^2} + f_i^n. \tag{3.11}$$

Now we assume $u_i^{n-1}$ is already computed, but that all quantities at the "new" time level n are unknown. This time it is not possible to solve with respect to $u_i^n$ because this value couples to its neighbors in space, $u_{i-1}^n$ and $u_{i+1}^n$, which are also unknown.


Fig. 3.4 Forward Euler scheme for F = 0.5

Let us examine this fact for the case when $N_x = 3$. Equation (3.11) written for $i = 1,\ldots,N_x-1 = 1,2$ becomes

$$\frac{u_1^n - u_1^{n-1}}{\Delta t} = \alpha\frac{u_2^n - 2u_1^n + u_0^n}{\Delta x^2} + f_1^n, \tag{3.12}$$
$$\frac{u_2^n - u_2^{n-1}}{\Delta t} = \alpha\frac{u_3^n - 2u_2^n + u_1^n}{\Delta x^2} + f_2^n. \tag{3.13}$$

The boundary values $u_0^n$ and $u_3^n$ are known as zero. Collecting the unknown new values $u_1^n$ and $u_2^n$ on the left-hand side and multiplying by $\Delta t$ gives

$$(1+2F)u_1^n - Fu_2^n = u_1^{n-1} + \Delta t f_1^n, \tag{3.14}$$
$$-Fu_1^n + (1+2F)u_2^n = u_2^{n-1} + \Delta t f_2^n. \tag{3.15}$$

This is a coupled $2\times 2$ system of algebraic equations for the unknowns $u_1^n$ and $u_2^n$. The equivalent matrix form is

$$\begin{pmatrix} 1+2F & -F\\ -F & 1+2F \end{pmatrix}\begin{pmatrix} u_1^n\\ u_2^n \end{pmatrix} = \begin{pmatrix} u_1^{n-1} + \Delta t f_1^n\\ u_2^{n-1} + \Delta t f_2^n \end{pmatrix}$$

Terminology: implicit vs. explicit methods 
Discretization methods that lead to a coupled system of equations for the unknown function at a new time level are said to be implicit methods. The counterpart, explicit methods, refers to discretization methods where there is a simple explicit formula for the values of the unknown function at each of the spatial mesh points at the new time level. From an implementational point of view, implicit methods are more demanding to code since they require the solution of coupled equations, i.e., a matrix system, at each time level. With explicit methods we have a closed-form formula for the value of the unknown at each mesh point.

Very often explicit schemes have a restriction on the size of the time step that can be relaxed by using implicit schemes. In fact, implicit schemes are frequently unconditionally stable, so the size of the time step is governed by accuracy and not by stability. This is the great advantage of implicit schemes.
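To make the notion of a coupled system concrete, the $2\times 2$ system for $N_x = 3$ can be solved with a general linear solver. The numbers below (F, the time step, the old values, and the source values) are made up for illustration only.

```python
import numpy as np

F = 0.5                         # illustrative Fourier number
dt = 0.1                        # illustrative time step
u_prev = np.array([1.0, 2.0])   # u_1^{n-1}, u_2^{n-1} (made-up values)
f_new = np.array([0.0, 0.0])    # f_1^n, f_2^n

# Coefficient matrix and right-hand side of the 2x2 Backward Euler system
A = np.array([[1 + 2*F, -F],
              [-F, 1 + 2*F]])
b = u_prev + dt*f_new

u_new = np.linalg.solve(A, b)   # u_1^n, u_2^n
print(u_new)
```

Both new values depend on both old values, in contrast to an explicit scheme where each new value follows from a formula of its own.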


In the general case, (3.11) gives rise to a coupled $(N_x - 1)\times(N_x - 1)$ system of algebraic equations for all the unknown $u_i^n$ at the interior spatial points $i = 1,\ldots,N_x-1$. Collecting the unknowns on the left-hand side, (3.11) can be written (here with $f = 0$)

$$-Fu_{i-1}^n + (1+2F)u_i^n - Fu_{i+1}^n = u_i^{n-1}, \tag{3.16}$$

for $i = 1,\ldots,N_x-1$. One can either view these equations as a system where the $u_i^n$ values at the internal mesh points, $i = 1,\ldots,N_x-1$, are unknown, or we may append the boundary values $u_0^n$ and $u_{N_x}^n$ to the system. In the latter case, all $u_i^n$ for $i = 0,\ldots,N_x$ are considered unknown, and we must add the boundary equations to the $N_x - 1$ equations in (3.16):

$$u_0^n = 0, \tag{3.17}$$
$$u_{N_x}^n = 0. \tag{3.18}$$

A coupled system of algebraic equations can be written on matrix form, and this is important if we want to call up ready-made software for solving the system. The equations (3.16) and (3.17)-(3.18) correspond to the matrix equation

$$AU = b$$
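where $U = (u_0^n,\ldots,u_{N_x}^n)$, and the matrix A has the following structure:

$$A = \begin{pmatrix}
A_{0,0} & A_{0,1} & 0 & \cdots & \cdots & 0\\
A_{1,0} & A_{1,1} & A_{1,2} & \ddots & & \vdots\\
0 & \ddots & \ddots & \ddots & \ddots & \vdots\\
\vdots & \ddots & A_{i,i-1} & A_{i,i} & A_{i,i+1} & 0\\
\vdots & & \ddots & \ddots & \ddots & A_{N_x-1,N_x}\\
0 & \cdots & \cdots & 0 & A_{N_x,N_x-1} & A_{N_x,N_x}
\end{pmatrix} \tag{3.19}$$

The nonzero elements are given by

$$A_{i,i-1} = -F, \tag{3.20}$$
$$A_{i,i} = 1 + 2F, \tag{3.21}$$
$$A_{i,i+1} = -F, \tag{3.22}$$

in the equations for internal points, $i = 1,\ldots,N_x-1$. The first and last equation correspond to the boundary condition, where we know the solution, and therefore we must have

$$A_{0,0} = 1, \tag{3.23}$$
$$A_{0,1} = 0, \tag{3.24}$$
$$A_{N_x,N_x-1} = 0, \tag{3.25}$$
$$A_{N_x,N_x} = 1. \tag{3.26}$$

The right-hand side b is written as

$$b = (b_0, b_1, \ldots, b_i, \ldots, b_{N_x})^T, \tag{3.27}$$

with

$$b_0 = 0, \tag{3.28}$$
$$b_i = u_i^{n-1}, \quad i = 1,\ldots,N_x-1, \tag{3.29}$$
$$b_{N_x} = 0. \tag{3.30}$$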

We observe that the matrix A contains quantities that do not change in time. 
Therefore, A can be formed once and for all before we enter the recursive formulas 
for the time evolution. The right-hand side b, however, must be updated at each 
time step. This leads to the following computational algorithm, here sketched with 
Python code: 


import scipy.linalg

x = np.linspace(0, L, Nx+1)    # mesh points in space
dx = x[1] - x[0]
t = np.linspace(0, T, Nt+1)    # mesh points in time
u   = np.zeros(Nx+1)           # unknown u at new time level
u_n = np.zeros(Nx+1)           # u at the previous time level

# Data structures for the linear system
A = np.zeros((Nx+1, Nx+1))
b = np.zeros(Nx+1)

for i in range(1, Nx):
    A[i,i-1] = -F
    A[i,i+1] = -F
    A[i,i] = 1 + 2*F
A[0,0] = A[Nx,Nx] = 1

# Set initial condition u(x,0) = I(x)
for i in range(0, Nx+1):
    u_n[i] = I(x[i])

for n in range(0, Nt):
    # Compute b and solve linear system
    for i in range(1, Nx):
        b[i] = u_n[i]          # right-hand side (3.29)
    b[0] = b[Nx] = 0
    u[:] = scipy.linalg.solve(A, b)

    # Update u_n before next step
    u_n[:] = u


Regarding verification, the same considerations apply as for the Forward Euler 
method (Sect. 3.1.4). 
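As a sketch of what such a verification could look like, the manufactured solution $u = 5tx(L-x)$ from Sect. 3.1.4 can be fed to a small dense Backward Euler solver. The function solver_BE_dense below is hypothetical and written for this illustration only; it assumes the source term is sampled at the new time level, so that the right-hand side becomes $u_i^{n-1} + \Delta t f_i^n$, and it uses $F = 2$ to emphasize that no stability limit applies.

```python
import numpy as np

def solver_BE_dense(I, a, f, L, Nx, F, T):
    """Backward Euler for u_t = a*u_xx + f, u = 0 at x = 0 and x = L."""
    x = np.linspace(0, L, Nx+1)
    dx = x[1] - x[0]
    dt = F*dx**2/a
    Nt = int(round(T/dt))
    t = np.linspace(0, Nt*dt, Nt+1)
    A = np.zeros((Nx+1, Nx+1))
    for i in range(1, Nx):
        A[i, i-1] = -F
        A[i, i+1] = -F
        A[i, i] = 1 + 2*F
    A[0, 0] = A[Nx, Nx] = 1
    u_n = I(x)
    for n in range(Nt):
        b = u_n + dt*f(x, t[n+1])   # source at the new time level
        b[0] = b[Nx] = 0            # boundary conditions
        u_n = np.linalg.solve(A, b)
    return u_n, x, t

a, L = 0.5, 1.5
u_exact = lambda x, t: 5*t*x*(L - x)
f = lambda x, t: 10*a*t + 5*x*(L - x)
u, x, t = solver_BE_dense(lambda x: u_exact(x, 0), a, f, L,
                          Nx=3, F=2.0, T=2)
diff = np.abs(u_exact(x, t[-1]) - u).max()
print(diff)
```

The backward difference is exact for functions linear in t, so the difference should again be at the level of rounding errors.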


3.2.2 Sparse Matrix Implementation 


We have seen from (3.19) that the matrix A is tridiagonal. The code segment above used a full, dense matrix representation of A, which stores a lot of values we know are zero beforehand, and worse, the solution algorithm computes with all these zeros. With $N_x + 1$ unknowns, the work by the solution algorithm is $\frac{2}{3}(N_x+1)^3$ and the storage requirements $(N_x+1)^2$. By utilizing the fact that A is tridiagonal and employing corresponding software tools that work with the three diagonals, the work and storage demands can be proportional to $N_x$ only. This leads to a dramatic improvement: with $N_x = 200$, which is a realistic resolution, the code runs about 40,000 times faster and reduces the storage to just 1.5%! There is no doubt that we should take advantage of the fact that A is tridiagonal.

The key idea is to apply a data structure for a tridiagonal or sparse matrix. The scipy.sparse package has relevant utilities. For example, we can store only the nonzero diagonals of a matrix. The package also has linear system solvers that operate on sparse matrix data structures. The code below illustrates how we can store only the main diagonal and the upper and lower diagonals.

import scipy.sparse
import scipy.sparse.linalg

# Representation of sparse matrix and right-hand side
main  = np.zeros(Nx+1)
lower = np.zeros(Nx)
upper = np.zeros(Nx)
b     = np.zeros(Nx+1)

# Precompute sparse matrix
main[:] = 1 + 2*F
lower[:] = -F
upper[:] = -F
# Insert boundary conditions, cf. (3.23)-(3.26)
main[0] = 1
upper[0] = 0
main[Nx] = 1
lower[-1] = 0

A = scipy.sparse.diags(
    diagonals=[main, lower, upper],
    offsets=[0, -1, 1], shape=(Nx+1, Nx+1),
    format='csr')
print(A.todense())  # Check that A is correct

# Set initial condition
for i in range(0, Nx+1):
    u_n[i] = I(x[i])

for n in range(0, Nt):
    b = u_n
    b[0] = b[-1] = 0.0  # boundary conditions
    u[:] = scipy.sparse.linalg.spsolve(A, b)
    u_n[:] = u


The scipy.sparse.linalg.spsolve function utilizes the sparse storage struc- 
ture of A and performs, in this case, a very efficient Gaussian elimination solve. 

The program diffu1D_u0.py contains a function solver_BE, which implements the Backward Euler scheme sketched above. As mentioned in Sect. 3.1.2, the functions plug and gaussian run the case with $I(x)$ as a discontinuous plug or a smooth Gaussian function. All experiments point to two characteristic features of the Backward Euler scheme: 1) it is always stable, and 2) it always gives a smooth, decaying solution.
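A quick consistency check of the sparse representation is to solve one Backward Euler system with both the sparse and a dense solver. The sketch below uses assumed values ($N_x = 200$, $F = 5$, and a plug as right-hand side) and is meant as an illustration, not as a timing study.

```python
import numpy as np
import scipy.sparse
import scipy.sparse.linalg

Nx, F = 200, 5.0                      # assumed resolution and Fourier number
main = np.full(Nx+1, 1 + 2*F)
lower = np.full(Nx, -F)
upper = np.full(Nx, -F)
main[0] = main[Nx] = 1                # boundary rows: u_0 = u_Nx = 0
upper[0] = 0
lower[-1] = 0

A_sparse = scipy.sparse.diags([main, lower, upper], offsets=[0, -1, 1],
                              shape=(Nx+1, Nx+1), format='csr')

x = np.linspace(0, 1, Nx+1)
b = np.where(np.abs(x - 0.5) > 0.1, 0.0, 1.0)   # plug as right-hand side

u_sparse = scipy.sparse.linalg.spsolve(A_sparse, b)
u_dense = np.linalg.solve(A_sparse.toarray(), b)
max_diff = np.abs(u_sparse - u_dense).max()
stored_fraction = A_sparse.nnz/float((Nx+1)**2)
print(max_diff, stored_fraction)
```

The stored fraction is about 3/(N_x+1), which for $N_x = 200$ is roughly the 1.5% mentioned in the text.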


3.2.3 Crank-Nicolson Scheme 


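The idea in the Crank-Nicolson scheme is to apply centered differences in space and time, combined with an average in time. We demand the PDE to be fulfilled at the spatial mesh points, but midway between the points in the time mesh:

$$\frac{\partial}{\partial t}u(x_i, t_{n+\frac{1}{2}}) = \alpha\frac{\partial^2}{\partial x^2}u(x_i, t_{n+\frac{1}{2}}) + f(x_i, t_{n+\frac{1}{2}}),$$

for $i = 1,\ldots,N_x-1$ and $n = 0,\ldots,N_t-1$.
With centered differences in space and time, we get

$$[D_t u = \alpha D_x D_x u + f]_i^{n+\frac{1}{2}}.$$

On the right-hand side we get an expression

$$\frac{1}{\Delta x^2}\left(u_{i-1}^{n+\frac{1}{2}} - 2u_i^{n+\frac{1}{2}} + u_{i+1}^{n+\frac{1}{2}}\right) + f_i^{n+\frac{1}{2}}.$$

This expression is problematic since $u_i^{n+\frac{1}{2}}$ is not one of the unknowns we compute. A possibility is to replace $u_i^{n+\frac{1}{2}}$ by an arithmetic average:

$$u_i^{n+\frac{1}{2}} \approx \frac{1}{2}\left(u_i^n + u_i^{n+1}\right).$$

In the compact notation, we can use the arithmetic average notation $\overline{u}^t$:

$$[D_t u = \alpha D_x D_x \overline{u}^t + f]_i^{n+\frac{1}{2}}.$$

We can also use an average for $f_i^{n+\frac{1}{2}}$:

$$[D_t u = \alpha D_x D_x \overline{u}^t + \overline{f}^t]_i^{n+\frac{1}{2}}.$$

After writing out the differences and average, multiplying by $\Delta t$, and collecting all unknown terms on the left-hand side, we get

$$u_i^{n+1} - \frac{1}{2}F\left(u_{i-1}^{n+1} - 2u_i^{n+1} + u_{i+1}^{n+1}\right) = u_i^n + \frac{1}{2}F\left(u_{i-1}^n - 2u_i^n + u_{i+1}^n\right) + \frac{1}{2}\Delta t f_i^{n+1} + \frac{1}{2}\Delta t f_i^n. \tag{3.31}$$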
Also here, as in the Backward Euler scheme, the new unknowns $u_{i-1}^{n+1}$, $u_i^{n+1}$, and $u_{i+1}^{n+1}$ are coupled in a linear system $AU = b$, where A has the same structure as in (3.19), but with slightly different entries:

$$A_{i,i-1} = -\frac{1}{2}F, \tag{3.32}$$
$$A_{i,i} = 1 + F, \tag{3.33}$$
$$A_{i,i+1} = -\frac{1}{2}F, \tag{3.34}$$

in the equations for internal points, $i = 1,\ldots,N_x-1$. The equations for the boundary points correspond to

$$A_{0,0} = 1, \tag{3.35}$$
$$A_{0,1} = 0, \tag{3.36}$$
$$A_{N_x,N_x-1} = 0, \tag{3.37}$$
$$A_{N_x,N_x} = 1. \tag{3.38}$$

The right-hand side b has entries

$$b_0 = 0, \tag{3.39}$$
$$b_i = u_i^n + \frac{1}{2}F\left(u_{i-1}^n - 2u_i^n + u_{i+1}^n\right) + \frac{1}{2}\Delta t\left(f_i^n + f_i^{n+1}\right), \quad i = 1,\ldots,N_x-1, \tag{3.40}$$
$$b_{N_x} = 0. \tag{3.41}$$



When verifying some implementation of the Crank-Nicolson scheme by convergence rate testing, one should note that the scheme is second order accurate in both space and time. The numerical error then reads

$$E = C_t\Delta t^r + C_x\Delta x^r,$$

where $r = 2$ ($C_t$ and $C_x$ are unknown constants, as before). When introducing a single discretization parameter, we may now simply choose

$$h = \Delta x = \Delta t,$$

which gives

$$E = C_t h^r + C_x h^r = (C_t + C_x)h^r,$$

where r should approach 2 as resolution is increased in the convergence rate computations.


3.2.4 The Unifying θ Rule


For the equation

$$\frac{\partial u}{\partial t} = G(u),$$

where $G(u)$ is some spatial differential operator, the θ-rule looks like

$$\frac{u_i^{n+1} - u_i^n}{\Delta t} = \theta G(u_i^{n+1}) + (1-\theta)G(u_i^n).$$

The important feature of this time discretization scheme is that we can implement one formula and then generate a family of well-known and widely used schemes:

- $\theta = 0$ gives the Forward Euler scheme in time
- $\theta = 1$ gives the Backward Euler scheme in time
- $\theta = \frac{1}{2}$ gives the Crank-Nicolson scheme in time

In the compact difference notation, we write the θ rule as

$$[\bar{D}_t u = \alpha D_x D_x u]^{n+\theta}.$$


We have that $t_{n+\theta} = \theta t_{n+1} + (1-\theta)t_n$.

Applied to the 1D diffusion problem, the θ-rule gives

$$\frac{u_i^{n+1} - u_i^n}{\Delta t} = \alpha\left(\theta\frac{u_{i+1}^{n+1} - 2u_i^{n+1} + u_{i-1}^{n+1}}{\Delta x^2} + (1-\theta)\frac{u_{i+1}^n - 2u_i^n + u_{i-1}^n}{\Delta x^2}\right) + \theta f_i^{n+1} + (1-\theta)f_i^n.$$

This scheme also leads to a matrix system with entries

$$A_{i,i-1} = -F\theta, \quad A_{i,i} = 1 + 2F\theta, \quad A_{i,i+1} = -F\theta,$$

while right-hand side entry $b_i$ is

$$b_i = u_i^n + F(1-\theta)\left(u_{i+1}^n - 2u_i^n + u_{i-1}^n\right) + \Delta t\,\theta f_i^{n+1} + \Delta t(1-\theta)f_i^n.$$

Since F already contains the factor $1/\Delta x^2$, no further division by $\Delta x^2$ appears in $b_i$. The corresponding entries for the boundary points are as in the Backward Euler and Crank-Nicolson schemes listed earlier.

Note that convergence rate testing with implementations of the theta rule must adjust the error expression according to which of the underlying schemes is actually being run. That is, if $\theta = 0$ (i.e., Forward Euler) or $\theta = 1$ (i.e., Backward Euler), there should be first order convergence, whereas with $\theta = 0.5$ (i.e., Crank-Nicolson), one should get second order convergence (as outlined in previous sections).
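The matrix entries and right-hand side of the θ-rule translate directly into code. The following is a minimal dense-matrix sketch, not the implementation in diffu1D_u0.py: the function solver_theta is hypothetical, the source term is assumed to be zero, and the test problem is $u_t = u_{xx}$ on $[0,1]$ with exact solution $e^{-\pi^2 t}\sin(\pi x)$. With $\theta = 0.5$ and $h = \Delta x = \Delta t$, the estimated rates should approach 2.

```python
import numpy as np

def solver_theta(I, a, L, Nx, dt, T, theta=0.5):
    """theta-rule for u_t = a*u_xx with u = 0 at x = 0, L and f = 0."""
    x = np.linspace(0, L, Nx+1)
    dx = x[1] - x[0]
    F = a*dt/dx**2
    Nt = int(round(T/dt))
    A = np.zeros((Nx+1, Nx+1))
    for i in range(1, Nx):
        A[i, i-1] = -F*theta
        A[i, i+1] = -F*theta
        A[i, i] = 1 + 2*F*theta
    A[0, 0] = A[Nx, Nx] = 1
    u_n = I(x)
    u_n[0] = u_n[Nx] = 0
    b = np.zeros(Nx+1)
    for n in range(Nt):
        b[1:-1] = u_n[1:-1] + F*(1 - theta)*(
            u_n[:-2] - 2*u_n[1:-1] + u_n[2:])
        b[0] = b[Nx] = 0
        u_n = np.linalg.solve(A, b)
    return u_n, x, Nt*dt

h, E = [], []
for Nx in (10, 20, 40, 80):
    dt = 1.0/Nx                       # h = dx = dt
    u, x, t_end = solver_theta(lambda x: np.sin(np.pi*x),
                               a=1.0, L=1.0, Nx=Nx, dt=dt, T=0.5)
    u_e = np.exp(-np.pi**2*t_end)*np.sin(np.pi*x)
    h.append(dt)
    E.append(np.abs(u_e - u).max())
rates = [np.log(E[i-1]/E[i])/np.log(h[i-1]/h[i])
         for i in range(1, len(h))]
print(rates)
```

Setting theta=1 in the same function gives the Backward Euler scheme, and theta=0 reduces the matrix to the identity, i.e., the Forward Euler update (subject to its stability limit).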


3.2.5 Experiments 


We can repeat the experiments from Sect. 3.1.5 to see if the Backward Euler or Crank-Nicolson schemes have problems with sawtooth-like noise when starting with a discontinuous initial condition. We can also verify that we can have $F > \frac{1}{2}$, which allows larger time steps than in the Forward Euler method.

The Backward Euler scheme always produces smooth solutions for any F. Figure 3.5 shows one example. Note that the mathematical discontinuity at $t = 0$ leads to a linear variation on a mesh, but the approximation to a jump becomes better as $N_x$ increases. In our simulation, we specify $\Delta t$ and F, and set $N_x$ to $L/\sqrt{\alpha\Delta t/F}$. Since $N_x \sim \sqrt{F}$, the discontinuity looks sharper in the Crank-Nicolson simulations with larger F.

The Crank-Nicolson method produces smooth solutions for small F, $F \le \frac{1}{2}$, but small noise gets more and more evident as F increases. Figures 3.6 and 3.7 demonstrate the effect for $F = 3$ and $F = 10$, respectively. Section 3.3 explains why such noise occurs.


3.2.6 The Laplace and Poisson Equation 
The Laplace equation, $\nabla^2 u = 0$, and the Poisson equation, $-\nabla^2 u = f$, occur in numerous applications throughout science and engineering. In 1D these equations read $u''(x) = 0$ and $-u''(x) = f(x)$, respectively. We can solve 1D variants of the Laplace equations with the listed software, because we can interpret $u_{xx} = 0$ as the limiting solution of $u_t = \alpha u_{xx}$ when u reaches a steady state limit where $u_t \to 0$. Similarly, Poisson's equation $-u_{xx} = f$ arises from solving $u_t = u_{xx} + f$ and letting $t \to \infty$ so $u_t \to 0$.

Fig. 3.5 Backward Euler scheme for F = 0.5

Technically in a program, we can simulate $t \to \infty$ by just taking one large time step: $\Delta t \to \infty$. In the limit, the Backward Euler scheme gives

$$-\frac{u_{i+1}^{n+1} - 2u_i^{n+1} + u_{i-1}^{n+1}}{\Delta x^2} = f_i^{n+1},$$

which is nothing but the discretization $[-D_x D_x u = f]_i^{n+1}$ of $-u_{xx} = f$.

The result above means that the Backward Euler scheme can solve the limit equation directly and hence produce a solution of the 1D Laplace equation. With the Forward Euler scheme we must do the time stepping since $\Delta t > \Delta x^2/(2\alpha)$ is illegal and leads to instability. We may interpret this time stepping as solving the equation system from $-u_{xx} = f$ by iterating on a pseudo time variable.
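The idea of one very large Backward Euler step can be tried on a Poisson problem with a known answer. The sketch below rests on assumptions chosen for illustration: $-u'' = 2$ on $[0,1]$ with $u(0) = u(1) = 0$, whose solution is $u = x(1-x)$, a mesh with $N_x = 10$, a zero initial state, and $\Delta t = 10^6$ standing in for infinity.

```python
import numpy as np

Nx = 10
x = np.linspace(0, 1, Nx+1)
dx = x[1] - x[0]
dt = 1e6                      # one very large time step
F = dt/dx**2                  # a = 1 in u_t = u_xx + f
f = 2*np.ones(Nx+1)           # -u'' = 2 has the solution u = x(1 - x)

# Backward Euler matrix with boundary rows
A = np.zeros((Nx+1, Nx+1))
for i in range(1, Nx):
    A[i, i-1] = A[i, i+1] = -F
    A[i, i] = 1 + 2*F
A[0, 0] = A[Nx, Nx] = 1

u_n = np.zeros(Nx+1)          # arbitrary initial state
b = u_n + dt*f
b[0] = b[Nx] = 0
u = np.linalg.solve(A, b)

error = np.abs(u - x*(1 - x)).max()
print(error)
```

Because the centered difference is exact for quadratic functions, the only deviation from $x(1-x)$ comes from the finite size of the time step and from rounding.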
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Fig. 3.6 Crank-Nicolson scheme for F = 3 


3.3 Analysis of Schemes for the Diffusion Equation 


The numerical experiments in Sect. 3.1.5 and 3.2.5 reveal that there are some nu- 
merical problems with the Forward Euler and Crank-Nicolson schemes: sawtooth- 
like noise is sometimes present in solutions that are, from a mathematical point of 
view, expected to be smooth. This section presents a mathematical analysis that 
explains the observed behavior and arrives at criteria for obtaining numerical solu- 
tions that reproduce the qualitative properties of the exact solutions. In short, we 
shall explain what is observed in Fig. 3.1-3.7. 


3.3.1 Properties of the Solution 
A particular characteristic of diffusive processes, governed by an equation like

$$u_t = \alpha u_{xx}, \tag{3.42}$$

is that the initial shape $u(x,0) = I(x)$ spreads out in space with time, along with a decaying amplitude. Three different examples will illustrate the spreading of u in space and the decay in time.

Fig. 3.7 Crank-Nicolson scheme for F = 10


Similarity solution The diffusion equation (3.42) admits solutions that depend on $\eta = (x - c)/\sqrt{4\alpha t}$ for a given value of c. One particular solution is

$$u(x,t) = a\,\mathrm{erf}(\eta) + b, \tag{3.43}$$

where

$$\mathrm{erf}(\eta) = \frac{2}{\sqrt{\pi}}\int_0^{\eta} e^{-\zeta^2}\,d\zeta, \tag{3.44}$$

is the error function, and a and b are arbitrary constants. The error function lies in $(-1, 1)$, is odd around $\eta = 0$, and goes relatively quickly to $\pm 1$:


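$$\lim_{\eta\to-\infty}\mathrm{erf}(\eta) = -1,$$
$$\lim_{\eta\to\infty}\mathrm{erf}(\eta) = 1,$$
$$\mathrm{erf}(\eta) = -\mathrm{erf}(-\eta),$$
$$\mathrm{erf}(0) = 0,$$
$$\mathrm{erf}(2) = 0.99532227,$$
$$\mathrm{erf}(3) = 0.99997791.$$

As $t \to 0$, the error function approaches a step function centered at $x = c$. For a diffusion problem posed on the unit interval $[0,1]$, we may choose the step at $x = 1/2$ (meaning $c = 1/2$), $a = -1/2$, $b = 1/2$. Then

$$u(x,t) = \frac{1}{2}\left(1 - \mathrm{erf}\left(\frac{x - \frac{1}{2}}{\sqrt{4\alpha t}}\right)\right) = \frac{1}{2}\mathrm{erfc}\left(\frac{x - \frac{1}{2}}{\sqrt{4\alpha t}}\right), \tag{3.45}$$

where we have introduced the complementary error function $\mathrm{erfc}(\eta) = 1 - \mathrm{erf}(\eta)$. The solution (3.45) implies the boundary conditions

$$u(0,t) = \frac{1}{2}\left(1 - \mathrm{erf}\left(\frac{-1/2}{\sqrt{4\alpha t}}\right)\right) = \frac{1}{2}\mathrm{erfc}\left(\frac{-1/2}{\sqrt{4\alpha t}}\right), \tag{3.46}$$
$$u(1,t) = \frac{1}{2}\left(1 - \mathrm{erf}\left(\frac{1/2}{\sqrt{4\alpha t}}\right)\right) = \frac{1}{2}\mathrm{erfc}\left(\frac{1/2}{\sqrt{4\alpha t}}\right). \tag{3.47}$$

For small enough t, $u(0,t) \approx 1$ and $u(1,t) \approx 0$, but as $t \to \infty$, $u(x,t) \to 1/2$ on $[0, 1]$.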
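The properties of the similarity solution (3.45) are easy to inspect with the error functions in Python's standard math module. The function name u_similarity and the sample times below are choices made for this illustration; $\alpha = 1$ is assumed.

```python
import math

def u_similarity(x, t, a=1.0):
    """The solution (3.45): a step at x = 1/2 that diffuses with time."""
    eta = (x - 0.5)/math.sqrt(4*a*t)
    return 0.5*math.erfc(eta)

# Tabulated values of the error function
print(math.erf(2), math.erf(3))

# Boundary values for a small time and an interior value for a large time
u_left_early = u_similarity(0.0, 1e-3)
u_right_early = u_similarity(1.0, 1e-3)
u_late = u_similarity(0.3, 1e6)
print(u_left_early, u_right_early, u_late)
```

For the small time the boundary values are practically 1 and 0, while for the large time the solution is close to 1/2 everywhere in the interval.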


Solution for a Gaussian pulse The standard diffusion equation $u_t = \alpha u_{xx}$ admits a Gaussian function as solution:

$$u(x,t) = \frac{1}{\sqrt{4\pi\alpha t}}\exp\left(-\frac{(x - c)^2}{4\alpha t}\right). \tag{3.48}$$

At $t = 0$ this is a Dirac delta function, so for computational purposes one must start to view the solution at some time $t = t_\epsilon > 0$. Replacing t by $t_\epsilon + t$ in (3.48) makes it easy to operate with a (new) t that starts at $t = 0$ with an initial condition with a finite width. The important feature of (3.48) is that the standard deviation σ of a sharp initial Gaussian pulse increases in time according to $\sigma = \sqrt{2\alpha t}$, making the pulse diffuse and flatten out.


Solution for a sine component Also, (3.42) admits a solution of the form

$$u(x,t) = Qe^{-at}\sin(kx). \tag{3.49}$$

The parameters Q and k can be freely chosen, while inserting (3.49) in (3.42) gives the constraint

$$a = \alpha k^2.$$

A very important feature is that the initial shape $I(x) = Q\sin(kx)$ undergoes a damping $\exp(-\alpha k^2 t)$, meaning that rapid oscillations in space, corresponding to large k, are very much faster dampened than slow oscillations in space, corresponding to small k. This feature leads to a smoothing of the initial condition with time. (In fact, one can use a few steps of the diffusion equation as a method for removing noise in signal processing.) To judge how good a numerical method is, we may look at its ability to smoothen or dampen the solution in the same way as the PDE does.

Fig. 3.8 Evolution of the solution of a diffusion problem: initial condition (upper left), 1/100 reduction of the small waves (upper right), 1/10 reduction of the long wave (lower left), and 1/100 reduction of the long wave (lower right)


The following example illustrates the damping properties of (3.49). We consider the specific problem

$$u_t = u_{xx}, \quad x \in (0,1),\ t \in (0,T],$$
$$u(0,t) = u(1,t) = 0, \quad t \in (0,T],$$
$$u(x,0) = \sin(\pi x) + 0.1\sin(100\pi x).$$

The initial condition has been chosen such that adding two solutions like (3.49) constructs an analytical solution to the problem:

$$u(x,t) = e^{-\pi^2 t}\sin(\pi x) + 0.1e^{-\pi^2 10^4 t}\sin(100\pi x). \tag{3.50}$$
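The time scales of the two components in (3.50) follow from solving $e^{-\alpha k^2 t} = 1/r$ for t, which gives $t = \ln r/(\alpha k^2)$. A small sketch, with a hypothetical helper time_to_reduce and $\alpha = 1$ as in the problem above:

```python
import math

def time_to_reduce(k, factor, a=1.0):
    """Time t at which exp(-a*k**2*t) equals 1/factor."""
    return math.log(factor)/(a*k**2)

t_short = time_to_reduce(100*math.pi, 100)   # sin(100*pi*x) component
t_long_10 = time_to_reduce(math.pi, 10)      # sin(pi*x), reduced to 1/10
t_long_100 = time_to_reduce(math.pi, 100)    # sin(pi*x), reduced to 1/100
print(t_short, t_long_10, t_long_100)
```

The short waves are reduced by a factor of 100 about $10^4$ times sooner than the long wave, since the damping rate scales with $k^2$.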


Figure 3.8 illustrates the rapid damping of rapid oscillations $\sin(100\pi x)$ and the very much slower damping of the slowly varying $\sin(\pi x)$ term. After about $t = 0.5\cdot 10^{-4}$ the rapid oscillations do not have a visible amplitude, while we have to wait until $t \sim 0.5$ before the amplitude of the long wave $\sin(\pi x)$ becomes very small.

3.3.2 Analysis of Discrete Equations

A counterpart to (3.49) is the complex representation of the same function:

$$u(x,t) = Qe^{-at}e^{ikx},$$

where $i = \sqrt{-1}$ is the imaginary unit. We can add such functions, often referred to as wave components, to make a Fourier representation of a general solution of the diffusion equation:

$$u(x,t) \approx \sum_{k\in K} b_k e^{-\alpha k^2 t}e^{ikx}, \tag{3.51}$$
keK 


where K is a set of an infinite number of k values needed to construct the solution. In practice, however, the series is truncated and K is a finite set of k values needed to build a good approximate solution. Note that (3.50) is a special case of (3.51) where $K = \{\pi, 100\pi\}$, $b_\pi = 1$, and $b_{100\pi} = 0.1$.

The amplitudes $b_k$ of the individual Fourier waves must be determined from the initial condition. At $t = 0$ we have $u \approx \sum_k b_k\exp(ikx)$ and find K and $b_k$ such that

$$I(x) \approx \sum_{k\in K} b_k e^{ikx}. \tag{3.52}$$

(The relevant formulas for $b_k$ come from Fourier analysis, or equivalently, a least-squares method for approximating $I(x)$ in a function space with basis $\exp(ikx)$.)

Much insight about the behavior of numerical methods can be obtained by investigating how a wave component $\exp(-\alpha k^2 t)\exp(ikx)$ is treated by the numerical scheme. It appears that such wave components are also solutions of the schemes, but the damping factor $\exp(-\alpha k^2 t)$ varies among the schemes. To ease the forthcoming algebra, we write the damping factor as $A^n$. The exact amplification factor corresponding to A is $A_e = \exp(-\alpha k^2\Delta t)$.


3.3.3 Analysis of the Finite Difference Schemes 


We have seen that a general solution of the diffusion equation can be built as a linear combination of basic components

$$e^{-\alpha k^2 t}e^{ikx}.$$

A fundamental question is whether such components are also solutions of the finite difference schemes. This is indeed the case, but the amplitude $\exp(-\alpha k^2 t)$ might be modified (which also happens when solving the ODE counterpart $u' = -\alpha u$). We therefore look for numerical solutions of the form

$$u_q^n = A^n e^{ikq\Delta x} = A^n e^{ikx}, \tag{3.53}$$

where the amplification factor A must be determined by inserting the component into an actual scheme. Note that $A^n$ means A raised to the power of n, n being the index in the time mesh, while the superscript n in $u_q^n$ just denotes u at time $t_n$.

Stability The exact amplification factor is $A_e = \exp(-\alpha k^2\Delta t)$. We should therefore require $|A| < 1$ to have a decaying numerical solution as well. If $-1 \le A < 0$, $A^n$ will change sign from time level to time level, and we get stable, non-physical oscillations in the numerical solutions that are not present in the exact solution.


Accuracy To determine how accurately a finite difference scheme treats one wave component (3.53), we see that the basic deviation from the exact solution is reflected in how well $A^n$ approximates $A_e^n$, or how well A approximates $A_e$. We can plot $A_e$ and the various expressions for A, and we can make Taylor expansions of $A/A_e$ to see the error more analytically.


Truncation error As an alternative to examining the accuracy of the damping of 
a wave component, we can perform a general truncation error analysis as explained 
in Appendix B. Such results are more general, but less detailed than what we get 
from the wave component analysis. The truncation error can almost always be 
computed and represents the error in the numerical model when the exact solution 
is substituted into the equations. In particular, the truncation error analysis tells 
the order of the scheme, which is of fundamental importance when verifying codes 
based on empirical estimation of convergence rates. 


3.3.4 Analysis of the Forward Euler Scheme 


The Forward Euler finite difference scheme for $u_t = \alpha u_{xx}$ can be written as

$$[D_t^+ u = \alpha D_xD_x u]_q^n.$$

Inserting a wave component (3.53) in the scheme demands calculating the terms

$$e^{ikq\Delta x}[D_t^+ A]^n = e^{ikq\Delta x}A^n\frac{A-1}{\Delta t},$$

and

$$A^n D_xD_x[e^{ikx}]_q = A^n\left(-e^{ikq\Delta x}\frac{4}{\Delta x^2}\sin^2\left(\frac{k\Delta x}{2}\right)\right).$$

Inserting these terms in the discrete equation and dividing by $A^n e^{ikq\Delta x}$ leads to

$$\frac{A-1}{\Delta t} = -\alpha\frac{4}{\Delta x^2}\sin^2\left(\frac{k\Delta x}{2}\right),$$

and consequently

$$A = 1 - 4F\sin^2 p, \tag{3.54}$$

where

$$F = \frac{\alpha\Delta t}{\Delta x^2} \tag{3.55}$$

is the numerical Fourier number, and $p = k\Delta x/2$. The complete numerical solution
is then

$$u_q^n = (1 - 4F\sin^2 p)^n e^{ikq\Delta x}. \tag{3.56}$$

Stability We easily see that $A \le 1$. However, $A$ can be less than $-1$, which
will lead to growth of a numerical wave component. The criterion $A \ge -1$ implies

$$4F\sin^2 p \le 2.$$

The worst case is when $\sin^2 p = 1$, so a sufficient criterion for stability is

$$F \le \frac{1}{2}, \tag{3.57}$$

or expressed as a condition on $\Delta t$:

$$\Delta t \le \frac{\Delta x^2}{2\alpha}. \tag{3.58}$$

Note that halving the spatial mesh size, $\Delta x \rightarrow \frac{1}{2}\Delta x$, requires $\Delta t$ to be reduced by
a factor of 4. The method hence becomes very expensive for fine spatial meshes.
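
The stability argument can be checked numerically. The following is a minimal sketch, not code from the book's files: the names `A_FE` and `max_stable_dt` are illustrative assumptions. It evaluates the Forward Euler amplification factor (3.54) for all wave parameters $p \in [0, \pi/2]$ and the time step limit (3.58).

```python
import numpy as np

def A_FE(F, p):
    """Forward Euler amplification factor, formula (3.54)."""
    return 1 - 4*F*np.sin(p)**2

def max_stable_dt(alpha, dx):
    """Largest time step allowed by (3.58): dt <= dx**2/(2*alpha)."""
    return dx**2/(2.0*alpha)

# p = k*dx/2 covers all waves representable on the mesh
p = np.linspace(0, np.pi/2, 101)
for F in 0.25, 0.5, 0.51:
    A = A_FE(F, p)
    stable = bool(np.all(np.abs(A) <= 1))
    print('F=%.2f  min A=%.3f  stable=%s' % (F, A.min(), stable))

# Halving dx reduces the admissible dt by a factor of 4
ratio = max_stable_dt(1.0, 0.05)/max_stable_dt(1.0, 0.1)
print('dt ratio when dx is halved:', ratio)
```

The loop only confirms what the algebra says: the smallest value of $A$ is attained for the shortest wave, $p = \pi/2$, and it drops below $-1$ as soon as $F$ exceeds 0.5.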


Accuracy Since $A$ is expressed in terms of $F$ and the parameter we now call $p = k\Delta x/2$,
we should also express $A_e$ by $F$ and $p$. The exponent in $A_e$ is $-\alpha k^2\Delta t$,
which equals $-Fk^2\Delta x^2 = -4Fp^2$. Consequently,

$$A_e = \exp(-\alpha k^2\Delta t) = \exp(-4Fp^2).$$

All our $A$ expressions as well as $A_e$ are now functions of the two dimensionless
parameters $F$ and $p$.

Computing the Taylor series expansion of $A/A_e$ in terms of $F$ can easily be done
with aid of sympy:

    from sympy import symbols, exp, sin

    def A_exact(F, p):
        # Exact amplification factor expressed by F and p
        return exp(-4*F*p**2)

    def A_FE(F, p):
        # Forward Euler amplification factor (3.54)
        return 1 - 4*F*sin(p)**2

    F, p = symbols('F p')
    A_err_FE = A_FE(F, p)/A_exact(F, p)
    print(A_err_FE.series(F, 0, 6))


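The result is

$$\frac{A}{A_e} = 1 - 4F\sin^2 p + 4Fp^2 - 16F^2p^2\sin^2 p + 8F^2p^4 + \cdots$$

Recalling that $F = \alpha\Delta t/\Delta x^2$, $p = k\Delta x/2$, and that $\sin^2 p \le 1$, we realize that
the dominating terms in $A/A_e$ are at most

$$1 - 4\alpha\frac{\Delta t}{\Delta x^2} + \alpha k^2\Delta t - 4\alpha^2k^2\frac{\Delta t^2}{\Delta x^2} + \frac{1}{2}\alpha^2k^4\Delta t^2 + \cdots$$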

Truncation error We follow the theory explained in Appendix B. The recipe is to
set up the scheme in operator notation and use formulas from Appendix B.2.4 to derive
an expression for the residual. The details are documented in Appendix B.6.1.
We end up with a truncation error

$$R_q^n = O(\Delta t) + O(\Delta x^2).$$

Although this is not the true error $u_e(x_q, t_n) - u_q^n$, it indicates that the true error is
of the form

$$E = C_t\Delta t + C_x\Delta x^2$$

for two unknown constants $C_t$ and $C_x$.
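
An error model of this form is what one exploits when verifying a code through empirical convergence rates. The sketch below is an illustration under stated assumptions, not the book's test code: the helper `convergence_rates` and the constants `C_t`, `C_x` are hypothetical, and the "errors" are synthetic values generated from the model itself. With $\Delta t = F\Delta x^2$ for a fixed $F$, the model collapses to $E = C h$ with $h = \Delta t$, so the estimated rate should be 1 in $\Delta t$ and 2 in $\Delta x$.

```python
import numpy as np

def convergence_rates(h, E):
    """Return r_i = ln(E_i/E_{i-1})/ln(h_i/h_{i-1}) for successive experiments."""
    h = np.asarray(h, dtype=float)
    E = np.asarray(E, dtype=float)
    return np.log(E[1:]/E[:-1])/np.log(h[1:]/h[:-1])

# Synthetic error data obeying E = C_t*dt + C_x*dx**2 (assumed constants)
C_t, C_x, F = 2.0, 3.0, 0.5
dx = np.array([0.1, 0.05, 0.025, 0.0125])
dt = F*dx**2                       # keep the Fourier number fixed
E = C_t*dt + C_x*dx**2

r_dt = convergence_rates(dt, E)    # rates measured against dt
r_dx = convergence_rates(dx, E)    # rates measured against dx
print('rates in dt:', r_dt)
print('rates in dx:', r_dx)
```

In a real verification the array `E` would hold measured error norms from a sequence of simulations, and the rates would only approach the theoretical values as the mesh is refined.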


3.3.5 Analysis of the Backward Euler Scheme


Discretizing $u_t = \alpha u_{xx}$ by a Backward Euler scheme,

$$[D_t^- u = \alpha D_xD_x u]_q^n,$$

and inserting a wave component (3.53), leads to calculations similar to those arising
from the Forward Euler scheme, but since

$$e^{ikq\Delta x}[D_t^- A]^n = A^n e^{ikq\Delta x}\frac{1 - A^{-1}}{\Delta t},$$

we get

$$\frac{1 - A^{-1}}{\Delta t} = -\alpha\frac{4}{\Delta x^2}\sin^2\left(\frac{k\Delta x}{2}\right),$$

and then

$$A = (1 + 4F\sin^2 p)^{-1}. \tag{3.59}$$

The complete numerical solution can be written

$$u_q^n = (1 + 4F\sin^2 p)^{-n} e^{ikq\Delta x}. \tag{3.60}$$


Stability We see from (3.59) that 0 < A < 1, which means that all numerical wave 
components are stable and non-oscillatory for any At > 0. 


Truncation error The derivation of the truncation error for the Backward Euler 
scheme is almost identical to that for the Forward Euler scheme. We end up with 


$$R_q^n = O(\Delta t) + O(\Delta x^2).$$


3.3.6 Analysis of the Crank-Nicolson Scheme


The Crank-Nicolson scheme can be written as

$$[D_t u = \alpha D_xD_x\overline{u}^t]_q^{n+\frac{1}{2}},$$

or

$$[D_t u]_q^{n+\frac{1}{2}} = \frac{1}{2}\alpha\left([D_xD_x u]_q^n + [D_xD_x u]_q^{n+1}\right).$$

Inserting (3.53) in the time derivative approximation leads to

$$[D_t A^n e^{ikq\Delta x}]^{n+\frac{1}{2}} = A^{n+\frac{1}{2}}e^{ikq\Delta x}\frac{A^{\frac{1}{2}} - A^{-\frac{1}{2}}}{\Delta t} = A^n e^{ikq\Delta x}\frac{A-1}{\Delta t}.$$

Inserting (3.53) in the other terms and dividing by $A^n e^{ikq\Delta x}$ gives the relation

$$\frac{A-1}{\Delta t} = -\frac{1}{2}\alpha\frac{4}{\Delta x^2}\sin^2\left(\frac{k\Delta x}{2}\right)(1 + A),$$

and after some more algebra,

$$A = \frac{1 - 2F\sin^2 p}{1 + 2F\sin^2 p}. \tag{3.61}$$

The exact numerical solution is hence

$$u_q^n = \left(\frac{1 - 2F\sin^2 p}{1 + 2F\sin^2 p}\right)^n e^{ikq\Delta x}. \tag{3.62}$$


Stability The criteria $A > -1$ and $A < 1$ are fulfilled for any $\Delta t > 0$. Therefore,
the solution cannot grow, but it will oscillate if $1 - 2F\sin^2 p < 0$. To avoid such
non-physical oscillations, we must demand $F \le \frac{1}{2}$.


Truncation error The truncation error is derived in Appendix B.6.1: 


$$R_q^{n+\frac{1}{2}} = O(\Delta x^2) + O(\Delta t^2).$$


3.3.7 Analysis of the Leapfrog Scheme 


An attractive feature of the Forward Euler scheme is the explicit time stepping and
no need for solving linear systems. However, the accuracy in time is only $O(\Delta t)$.
We can get an explicit second-order scheme in time by using the Leapfrog method:

$$[D_{2t}u = \alpha D_xD_x u + f]_q^n.$$

Written out,

$$u_q^{n+1} = u_q^{n-1} + \frac{2\alpha\Delta t}{\Delta x^2}\left(u_{q+1}^n - 2u_q^n + u_{q-1}^n\right) + 2\Delta t f(x_q, t_n).$$

Fig. 3.9 Amplification factors for large time steps

We need some formula for the first step, $u_q^1$, but for that we can use a Forward Euler
step.

Unfortunately, the Leapfrog scheme is always unstable for the diffusion equation.
To see this, we insert a wave component $A^n e^{ikx}$ (with $f = 0$) and get

$$\frac{A - A^{-1}}{2\Delta t} = -\alpha\frac{4}{\Delta x^2}\sin^2 p,$$

or

$$A^2 + 8F\sin^2 p\,A - 1 = 0,$$

which has roots

$$A = -4F\sin^2 p \pm\sqrt{16F^2\sin^4 p + 1}.$$

The product of the two roots is $-1$, and the root with the minus sign has $|A| > 1$
for every wave with $\sin p \ne 0$, so the amplitude always grows, which is not in accordance
with the physics of the problem. However, for a PDE with a first-order derivative
in space, instead of a second-order one, the Leapfrog scheme performs very well.
Details are provided in Sect. 4.1.3.
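
The instability can also be confirmed numerically without relying on the closed-form roots. The following minimal sketch (the function name `leapfrog_amplification` is a hypothetical helper) inserts $u_q^n = A^n e^{ikq\Delta x}$ directly in the written-out Leapfrog scheme with $f = 0$, which gives a quadratic equation in $A$, and lets `numpy.roots` solve it.

```python
import numpy as np

def leapfrog_amplification(F, p):
    """Both amplification factors of the Leapfrog scheme, with p = k*dx/2."""
    # Inserting A**n*exp(i*k*q*dx) in the written-out scheme (f=0) gives
    # A**2 - c*A - 1 = 0, c = 2F*(exp(2ip) - 2 + exp(-2ip)) = 2F*(2cos(2p) - 2)
    c = 2*F*(2*np.cos(2*p) - 2)
    return np.roots([1, -c, -1])

for F in 0.01, 0.1, 0.5:
    A = leapfrog_amplification(F, np.pi/4)
    print('F=%.2f  |A| =' % F, np.abs(A))
```

Since the product of the two roots equals $-1$, one of them must exceed 1 in absolute value whenever the roots differ in magnitude, regardless of how small $F$ is.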


3.3.8 Summary of Accuracy of Amplification Factors 


We can plot the various amplification factors against p = kAx/2 for different 
choices of the F parameter. Figures 3.9, 3.10, and 3.11 show how long and small 
waves are damped by the various schemes compared to the exact damping. As 
long as all schemes are stable, the amplification factor is positive, except for Crank- 
Nicolson when F > 0.5. 
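
As a rough substitute for reading the curves off the figures, the amplification factors (3.54), (3.59), and (3.61) can be tabulated together with the exact factor. This is a minimal sketch; the function names are illustrative choices and not taken from the book's source files.

```python
import numpy as np

def A_exact(F, p):
    return np.exp(-4*F*p**2)

def A_FE(F, p):
    return 1 - 4*F*np.sin(p)**2

def A_BE(F, p):
    return 1/(1 + 4*F*np.sin(p)**2)

def A_CN(F, p):
    s = 2*F*np.sin(p)**2
    return (1 - s)/(1 + s)

# A few wave parameters from the longest (p=0) to the shortest (p=pi/2) wave
p = np.linspace(0, np.pi/2, 5)
for F in 0.1, 0.5, 2, 20:
    print('F = %g' % F)
    for name, A in ('exact', A_exact), ('FE', A_FE), ('BE', A_BE), ('CN', A_CN):
        print('  %-5s' % name, np.round(A(F, p), 3))
```

For $F > 0.5$ the Forward Euler column contains values below $-1$ (instability), while the Crank-Nicolson column stays within $[-1, 1]$ but turns negative for the short waves.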

The effect of negative amplification factors is that $A^n$ changes sign from one
time level to the next, thereby giving rise to oscillations in time in an animation of
the solution. We see from Fig. 3.9 that for $F = 20$, waves with $p \ge \pi/4$ undergo
a damping close to $-1$, which means that the amplitude does not decay and that the
wave component jumps up and down (flips amplitude) in time. For $F = 2$ we have
a damping of a factor of 0.5 from one time level to the next, which is very much
weaker than the exact damping. Short waves will therefore fail to be effectively
dampened. These waves will manifest themselves as high frequency oscillatory
noise in the solution.

Fig. 3.10 Amplification factors for time steps around the Forward Euler stability limit

Fig. 3.11 Amplification factors for small time steps (panels F = 0.1 and F = 0.01, plotted against p)

A value $p = \pi/4$ corresponds to four mesh points per wave length of $e^{ikx}$, while
$p = \pi/2$ implies only two points per wave length, which is the smallest number of
points we can have to represent the wave on the mesh.

To demonstrate the oscillatory behavior of the Crank-Nicolson scheme, we
choose an initial condition that leads to short waves with significant amplitude. A
discontinuous $I(x)$ will in particular serve this purpose: Figures 3.6 and 3.7 correspond
to $F = 3$ and $F = 10$, respectively, and we see how short waves pollute the
overall solution.



3.3.9 Analysis of the 2D Diffusion Equation 


Diffusion in several dimensions is treated later, but it is appropriate to include the 
analysis here. We first consider the 2D diffusion equation 


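$$u_t = \alpha(u_{xx} + u_{yy}),$$

which has Fourier component solutions of the form

$$u(x, y, t) = Ae^{-\alpha k^2 t}e^{i(k_x x + k_y y)},$$

with $k^2 = k_x^2 + k_y^2$, and the schemes have discrete versions of this Fourier component:

$$u_{q,r}^n = A\xi^n e^{i(k_x q\Delta x + k_y r\Delta y)}.$$

The Forward Euler scheme For the Forward Euler discretization,

$$[D_t^+ u = \alpha(D_xD_x u + D_yD_y u)]_{q,r}^n,$$

we get

$$\frac{\xi - 1}{\Delta t} = -\alpha\frac{4}{\Delta x^2}\sin^2\left(\frac{k_x\Delta x}{2}\right) - \alpha\frac{4}{\Delta y^2}\sin^2\left(\frac{k_y\Delta y}{2}\right).$$

Introducing

$$p_x = \frac{k_x\Delta x}{2}, \quad p_y = \frac{k_y\Delta y}{2},$$

we can write the equation for $\xi$ more compactly as

$$\frac{\xi - 1}{\Delta t} = -\alpha\frac{4}{\Delta x^2}\sin^2 p_x - \alpha\frac{4}{\Delta y^2}\sin^2 p_y,$$

and solve for $\xi$, using the Fourier numbers $F_x = \alpha\Delta t/\Delta x^2$ and $F_y = \alpha\Delta t/\Delta y^2$:

$$\xi = 1 - 4F_x\sin^2 p_x - 4F_y\sin^2 p_y. \tag{3.63}$$

The complete numerical solution for a wave component is

$$u_{q,r}^n = A(1 - 4F_x\sin^2 p_x - 4F_y\sin^2 p_y)^n e^{i(k_x q\Delta x + k_y r\Delta y)}. \tag{3.64}$$

For stability we demand $-1 \le \xi \le 1$, and $-1 \le \xi$ is the critical limit, since
clearly $\xi \le 1$, and the worst case happens when the sines are at their maximum.
The stability criterion becomes

$$F_x + F_y \le \frac{1}{2}. \tag{3.65}$$

For the special, yet common, case $\Delta x = \Delta y = h$, the stability criterion can be
written as

$$\Delta t \le \frac{h^2}{2d\alpha},$$

where $d$ is the number of space dimensions: $d = 1, 2, 3$.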
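
The 2D criterion translates directly into a time step limit for unequal mesh spacings. The sketch below is only an illustration: `xi_FE_2D` and `max_dt_FE` are hypothetical helper names, and the parameter values are arbitrary.

```python
import numpy as np

def xi_FE_2D(Fx, Fy, px, py):
    """Forward Euler amplification factor in 2D, formula (3.63)."""
    return 1 - 4*Fx*np.sin(px)**2 - 4*Fy*np.sin(py)**2

def max_dt_FE(alpha, dx, dy):
    """Largest dt satisfying F_x + F_y <= 1/2."""
    return 0.5/(alpha*(1.0/dx**2 + 1.0/dy**2))

alpha, dx, dy = 1.0, 0.1, 0.05       # arbitrary illustration values
dt = max_dt_FE(alpha, dx, dy)
Fx, Fy = alpha*dt/dx**2, alpha*dt/dy**2

# Worst case: the shortest waves in both directions, p_x = p_y = pi/2
xi_worst = xi_FE_2D(Fx, Fy, np.pi/2, np.pi/2)
print('dt = %g, F_x + F_y = %g, worst-case xi = %g' % (dt, Fx + Fy, xi_worst))
```

With `dx == dy == h` the helper reduces to $h^2/(4\alpha)$, which is the formula above for $d = 2$.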

The Backward Euler scheme The Backward Euler method,

$$[D_t^- u = \alpha(D_xD_x u + D_yD_y u)]_{q,r}^n,$$

results in

$$1 - \xi^{-1} = -4F_x\sin^2 p_x - 4F_y\sin^2 p_y,$$

and

$$\xi = (1 + 4F_x\sin^2 p_x + 4F_y\sin^2 p_y)^{-1},$$

which is always in $(0, 1]$. The solution for a wave component becomes

$$u_{q,r}^n = A(1 + 4F_x\sin^2 p_x + 4F_y\sin^2 p_y)^{-n} e^{i(k_x q\Delta x + k_y r\Delta y)}. \tag{3.66}$$

The Crank-Nicolson scheme With a Crank-Nicolson discretization,

$$[D_t u]_{q,r}^{n+\frac{1}{2}} = \frac{1}{2}[\alpha(D_xD_x u + D_yD_y u)]_{q,r}^{n+1} + \frac{1}{2}[\alpha(D_xD_x u + D_yD_y u)]_{q,r}^n,$$

we have, after some algebra,

$$\xi = \frac{1 - 2(F_x\sin^2 p_x + F_y\sin^2 p_y)}{1 + 2(F_x\sin^2 p_x + F_y\sin^2 p_y)}.$$

The fraction on the right-hand side is always less than 1, so stability in the sense
of non-growing wave components is guaranteed for all physical and numerical parameters.
However, the fraction can become negative and result in non-physical
oscillations. This phenomenon happens when

$$F_x\sin^2 p_x + F_y\sin^2 p_y > \frac{1}{2}.$$

A criterion against non-physical oscillations is therefore

$$F_x + F_y \le \frac{1}{2},$$

which is the same limit as the stability criterion for the Forward Euler scheme.
The exact discrete solution is

$$u_{q,r}^n = A\left(\frac{1 - 2(F_x\sin^2 p_x + F_y\sin^2 p_y)}{1 + 2(F_x\sin^2 p_x + F_y\sin^2 p_y)}\right)^n e^{i(k_x q\Delta x + k_y r\Delta y)}. \tag{3.67}$$


3.3.10 Explanation of Numerical Artifacts


The behavior of the solution generated by Forward Euler discretization in time (and
centered differences in space) is summarized at the end of Sect. 3.1.5. Can we, from
the analysis above, explain the behavior?

We may start by looking at Fig. 3.3 where $F = 0.51$. The figure shows that
the solution is unstable and grows in time. The stability limit for such growth is
$F = 0.5$ and since the $F$ in this simulation is slightly larger, growth is unavoidable.

Figure 3.1 has unexpected features: we would expect the solution of the diffusion
equation to be smooth, but the graphs in Fig. 3.1 contain non-smooth noise. Turning
to Fig. 3.4, which has a quite similar initial condition, we see that the curves are
indeed smooth. The problem with the results in Fig. 3.1 is that the initial condition
is discontinuous. To represent it, we need a significant amplitude on the shortest
waves in the mesh. However, for $F = 0.5$, the shortest wave ($p = \pi/2$) gives the
amplitude in the numerical solution as $(1 - 4F)^n$, which oscillates between negative
and positive values at subsequent time levels for $F > \frac{1}{4}$. Since the shortest waves
have visible amplitudes in the solution profile, the oscillations become visible.
The smooth initial condition in Fig. 3.4, on the other hand, leads to very small
amplitudes of the shortest waves. That these waves then oscillate in a non-physical
way for $F = 0.5$ is not a visible effect. The oscillations in time in the amplitude
$(1 - 4F)^n$ disappear for $F \le \frac{1}{4}$, and that is why also the discontinuous initial
condition always leads to smooth solutions in Fig. 3.2, where $F = \frac{1}{4}$.

Turning the attention to the Backward Euler scheme and the experiments in
Fig. 3.5, we see that even the discontinuous initial condition gives smooth solutions
for $F = 0.5$ (and in fact all other $F$ values). From the exact expression of the
numerical amplitude, $(1 + 4F\sin^2 p)^{-1}$, we realize that this factor can never flip
between positive and negative values, and no instabilities can occur. The conclusion
is that the Backward Euler scheme always produces smooth solutions. Also,
the Backward Euler scheme guarantees that the solution cannot grow in time (unless
we add a source term to the PDE, but that is meant to represent a physically
relevant growth).

Finally, we have some small, strange artifacts when simulating the development
of the initial plug profile with the Crank-Nicolson scheme, see Fig. 3.7, where
$F = 3$. The Crank-Nicolson scheme cannot give growing amplitudes, but it may
give oscillating amplitudes in time. The critical factor is $1 - 2F\sin^2 p$, which for
the shortest waves ($p = \pi/2$) indicates the limit $F = 0.5$ for oscillation-free solutions.
With the discontinuous initial condition, we have enough amplitude on the shortest waves so their
wrong behavior is visible, and this is what we see as small oscillations in Fig. 3.7.
The only remedy is to lower the $F$ value.


3.4 Exercises 


Exercise 3.1: Explore symmetry in a 1D problem 
This exercise simulates the exact solution (3.48). Suppose for simplicity that c = 0. 


a) Formulate an initial-boundary value problem that has (3.48) as solution in the 
domain [—L, L]. Use the exact solution (3.48) as Dirichlet condition at the 
boundaries. Simulate the diffusion of the Gaussian peak. Observe that the solu- 
tion is symmetric around x = 0. 

b) Show from (3.48) that $u_x(c, t) = 0$. Since the solution is symmetric around
   $x = c = 0$, we can solve the numerical problem in half of the domain, using
   a symmetry boundary condition $u_x = 0$ at $x = 0$. Set up the initial-boundary
   value problem in this case. Simulate the diffusion problem in $[0, L]$ and compare
   with the solution in a).


Filename: diffu_symmetric_gaussian. 


Exercise 3.2: Investigate approximation errors from a $u_x = 0$ boundary condition

We consider the problem solved in Exercise 3.1 part b). The boundary condition
$u_x(0, t) = 0$ can be implemented in two ways: 1) by a standard symmetric finite
difference $[D_{2x}u]_i^n = 0$, or 2) by a one-sided difference $[D_x^+ u]_i^n = 0$. Investigate
the effect of these two conditions on the convergence rate in space.


Hint If you use a Forward Euler scheme, choose a discretization parameter $h = \Delta t = \Delta x^2$
and assume the error goes like $E \sim h^r$. The error in the scheme is
$O(\Delta t, \Delta x^2)$ so one should expect that the estimated $r$ approaches 1. The question
is if a one-sided difference approximation to $u_x(0, t) = 0$ destroys this convergence
rate.

Filename: diffu_onesided_fd. 


Exercise 3.3: Experiment with open boundary conditions in 1D 

We address diffusion of a Gaussian function as in Exercise 3.1, in the domain [0, L], 
but now we shall explore different types of boundary conditions on x = L. In real- 
life problems we do not know the exact solution on x = L and must use something 
simpler. 


a) Imagine that we want to solve the problem numerically on $[0, L]$, with a symmetry
   boundary condition $u_x = 0$ at $x = 0$, but we do not know the exact solution
   and cannot for that reason assign a correct Dirichlet condition at $x = L$. One
   idea is to simply set $u(L, t) = 0$ since this will be an accurate approximation
   before the diffused pulse reaches $x = L$ and even thereafter it might be a satisfactory
   condition if the exact $u$ has a small value. Let $u_e$ be the exact solution
   and let $u$ be the solution of $u_t = \alpha u_{xx}$ with an initial Gaussian pulse and the
   boundary conditions $u_x(0, t) = u(L, t) = 0$. Derive a diffusion problem for
   the error $e = u_e - u$. Solve this problem numerically using an exact Dirichlet
   condition at $x = L$. Animate the evolution of the error and make a curve plot
   of the error measure


E(t) = 


Is this a suitable error measure for the present problem? 

b) Instead of using $u(L, t) = 0$ as approximate boundary condition for letting the
   diffused Gaussian pulse move out of our finite domain, one may try $u_x(L, t) = 0$
   since the solution for large $t$ is quite flat. Argue that this condition gives
   a completely wrong asymptotic solution as $t \rightarrow \infty$. To do this, integrate the
   diffusion equation from 0 to $L$, integrate $u_{xx}$ by parts (or use Gauss' divergence
   theorem in 1D) to arrive at the important property

   $$\frac{d}{dt}\int_0^L u(x, t)\,dx = 0,$$

   implying that $\int_0^L u\,dx$ must be constant in time, and therefore

   $$\int_0^L u(x, t)\,dx = \int_0^L I(x)\,dx.$$

   The integral of the initial pulse is 1.

c) Another idea for an artificial boundary condition at $x = L$ is to use a cooling
   law

   $$-\alpha u_x = q(u - u_S), \tag{3.68}$$

   where $q$ is an unknown heat transfer coefficient and $u_S$ is the surrounding
   temperature in the medium outside of $[0, L]$. (Note that arguing that $u_S$ is
   approximately $u(L, t)$ gives the $u_x = 0$ condition from the previous subexercise
   that is qualitatively wrong for large $t$.) Develop a diffusion problem for the error
   in the solution using (3.68) as boundary condition. Assume one can take $u_S = 0$
   "outside the domain" since $u_e \rightarrow 0$ as $x \rightarrow \infty$. Find a function $q = q(t)$ such
   that the exact solution obeys the condition (3.68). Test some constant values
   of $q$ and animate how the corresponding error function behaves. Also compute
   $E(t)$ curves as defined above.


Filename: diffu_open_BC. 


Exercise 3.4: Simulate a diffused Gaussian peak in 2D/3D 


a) Generalize (3.48) to multi dimensions by assuming that one-dimensional solutions
   can be multiplied to solve $u_t = \alpha\nabla^2 u$. Set $c = 0$ such that the peak of the
   Gaussian is at the origin.

b) One can from the exact solution show that $u_x = 0$ on $x = 0$, $u_y = 0$ on $y = 0$,
   and $u_z = 0$ on $z = 0$. The approximately correct condition $u = 0$ can be set
   on the remaining boundaries (say $x = L$, $y = L$, $z = L$), cf. Exercise 3.3.
   Simulate a 2D case and make an animation of the diffused Gaussian peak.

c) The formulation in b) makes use of symmetry of the solution such that we can
   solve the problem in the first quadrant (2D) or octant (3D) only. To check that
   the symmetry assumption is correct, formulate the problem without symmetry in
   a domain $[-L, L] \times [-L, L]$ in 2D. Use $u = 0$ as approximately correct boundary
   condition. Simulate the same case as in b), but in a four times as large domain.
   Make an animation and compare it with the one in b).


Filename: diffu_symmetric_gaussian_2D. 


Exercise 3.5: Examine stability of a diffusion model with a source term

Consider a diffusion equation with a linear $u$ term:

$$u_t = \alpha u_{xx} + \beta u.$$

a) Derive in detail the Forward Euler, Backward Euler, and Crank-Nicolson
   schemes for this type of diffusion model. Thereafter, formulate a $\theta$-rule to
   summarize the three schemes.

b) Assume a solution like (3.49) and find the relation between $a$, $k$, $\alpha$, and $\beta$.


Hint Insert (3.49) in the PDE problem. 


c) Calculate the stability of the Forward Euler scheme. Design numerical experi- 
ments to confirm the results. 


Hint Insert the discrete counterpart to (3.49) in the numerical scheme. Run exper- 
iments at the stability limit and slightly above. 


d) Repeat c) for the Backward Euler scheme. 
e) Repeat c) for the Crank-Nicolson scheme. 
f) How does the extra term $\beta u$ impact the accuracy of the three schemes?


Hint For analysis of the accuracy, compare the numerical and exact amplification 
factors, in graphs and/or by Taylor series expansion. 
Filename: diffu_stability_uterm. 


3.5 Diffusion in Heterogeneous Media 


Diffusion in heterogeneous media normally implies a non-constant diffusion coefficient
$\alpha = \alpha(x)$. A 1D diffusion model with such a variable diffusion coefficient
reads

$$\frac{\partial u}{\partial t} = \frac{\partial}{\partial x}\left(\alpha(x)\frac{\partial u}{\partial x}\right) + f(x, t), \quad x \in (0, L),\ t \in (0, T], \tag{3.69}$$
$$u(x, 0) = I(x), \quad x \in [0, L], \tag{3.70}$$
$$u(0, t) = U_0, \quad t > 0, \tag{3.71}$$
$$u(L, t) = U_L, \quad t > 0. \tag{3.72}$$

A short form of the diffusion equation with variable coefficients is $u_t = (\alpha u_x)_x + f$.
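

3.5.1 Discretization


We can discretize (3.69) by a $\theta$-rule in time and centered differences in space:

$$[D_t u]_i^{n+\frac{1}{2}} = \theta[D_x(\overline{\alpha}^x D_x u) + f]_i^{n+1} + (1 - \theta)[D_x(\overline{\alpha}^x D_x u) + f]_i^n.$$

Written out, this becomes

$$\frac{u_i^{n+1} - u_i^n}{\Delta t} = \theta\frac{1}{\Delta x^2}\left(\alpha_{i+\frac{1}{2}}(u_{i+1}^{n+1} - u_i^{n+1}) - \alpha_{i-\frac{1}{2}}(u_i^{n+1} - u_{i-1}^{n+1})\right)$$
$$\qquad + (1 - \theta)\frac{1}{\Delta x^2}\left(\alpha_{i+\frac{1}{2}}(u_{i+1}^n - u_i^n) - \alpha_{i-\frac{1}{2}}(u_i^n - u_{i-1}^n)\right) + \theta f_i^{n+1} + (1 - \theta)f_i^n,$$

where, e.g., an arithmetic mean can be used for $\alpha_{i+\frac{1}{2}}$:

$$\alpha_{i+\frac{1}{2}} = \frac{1}{2}(\alpha_i + \alpha_{i+1}).$$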


3.5.2 Implementation 


Suitable code for solving the discrete equations is very similar to what we created
for a constant $\alpha$. Since the Fourier number has no meaning for varying $\alpha$, we
introduce a related parameter $D = \Delta t/\Delta x^2$.


    from numpy import linspace, zeros
    import scipy.sparse
    import scipy.sparse.linalg

    def solver_theta(I, a, L, Nx, D, T, theta=0.5, u_L=1, u_R=0,
                     user_action=None):
        # a: array with the values of alpha at the Nx+1 mesh points
        x = linspace(0, L, Nx+1)   # mesh points in space
        dx = x[1] - x[0]
        dt = D*dx**2
        Nt = int(round(T/float(dt)))
        t = linspace(0, T, Nt+1)   # mesh points in time

        u   = zeros(Nx+1)   # solution array at t[n+1]
        u_n = zeros(Nx+1)   # solution at t[n]

        # The factor 0.5 stems from the arithmetic mean of alpha
        Dl = 0.5*D*theta
        Dr = 0.5*D*(1-theta)

        # Representation of sparse matrix and right-hand side
        diagonal = zeros(Nx+1)
        lower    = zeros(Nx)
        upper    = zeros(Nx)
        b        = zeros(Nx+1)

        # Precompute sparse matrix (scipy format)
        diagonal[1:-1] = 1 + Dl*(a[2:] + 2*a[1:-1] + a[:-2])
        lower[:-1] = -Dl*(a[1:-1] + a[:-2])
        upper[1:]  = -Dl*(a[2:] + a[1:-1])
        # Insert boundary conditions
        diagonal[0] = 1
        upper[0] = 0
        diagonal[Nx] = 1
        lower[-1] = 0

        A = scipy.sparse.diags(
            diagonals=[diagonal, lower, upper],
            offsets=[0, -1, 1],
            shape=(Nx+1, Nx+1),
            format='csr')

        # Set initial condition
        for i in range(0, Nx+1):
            u_n[i] = I(x[i])

        if user_action is not None:
            user_action(u_n, x, t, 0)

        # Time loop
        for n in range(0, Nt):
            b[1:-1] = u_n[1:-1] + Dr*(
                (a[2:] + a[1:-1])*(u_n[2:] - u_n[1:-1]) -
                (a[1:-1] + a[0:-2])*(u_n[1:-1] - u_n[:-2]))
            # Boundary conditions (constants or functions of time)
            b[0]  = u_L(t[n+1]) if callable(u_L) else u_L
            b[-1] = u_R(t[n+1]) if callable(u_R) else u_R
            # Solve
            u[:] = scipy.sparse.linalg.spsolve(A, b)

            if user_action is not None:
                user_action(u, x, t, n+1)

            # Switch variables before next step
            u_n, u = u, u_n

        # After the switch, u_n holds the most recently computed solution
        return u_n, x, t


The code is found in the file diffu1D_vc.py.


3.5.3 Stationary Solution 


As $t \rightarrow \infty$, the solution of the problem (3.69)-(3.72) will approach a stationary
limit where $\partial u/\partial t = 0$. The governing equation is then

$$\frac{d}{dx}\left(\alpha\frac{du}{dx}\right) = 0, \tag{3.73}$$

with boundary conditions $u(0) = U_0$ and $u(L) = U_L$. It is possible to obtain an
exact solution of (3.73) for any $\alpha$. Integrating twice and applying the boundary
conditions to determine the integration constants gives

$$u(x) = U_0 + (U_L - U_0)\frac{\int_0^x(\alpha(\xi))^{-1}d\xi}{\int_0^L(\alpha(\xi))^{-1}d\xi}. \tag{3.74}$$
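
When the integrals in (3.74) are not available in closed form, they can be approximated numerically. The following is a minimal sketch under stated assumptions: the function name `stationary_solution`, the choice of the trapezoidal rule, and the test coefficient $\alpha(x) = 1 + x$ are illustrative and not taken from the book's software.

```python
import numpy as np

def stationary_solution(alpha, L, U_0, U_L, n=1001):
    """Evaluate (3.74) on a uniform mesh; alpha is a vectorized callable."""
    x = np.linspace(0, L, n)
    g = 1.0/alpha(x)
    # Cumulative trapezoidal integral of 1/alpha from 0 to each x[i]
    G = np.concatenate(([0.0], np.cumsum(0.5*(g[1:] + g[:-1])*np.diff(x))))
    return x, U_0 + (U_L - U_0)*G/G[-1]

# Illustration: alpha = 1 + x gives u = ln(1+x)/ln(2) for U_0=0, U_L=1, L=1
x, u = stationary_solution(lambda x: 1 + x, L=1.0, U_0=0.0, U_L=1.0)
u_exact = np.log(1 + x)/np.log(2)
print('max deviation from exact formula:', np.abs(u - u_exact).max())
```

Such a function is also handy as a reference when checking that a time-dependent solver approaches the correct stationary state for large $t$.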


3.5.4 Piecewise Constant Medium


Consider a medium built of $M$ layers. The layer boundaries are denoted $b_0, \ldots, b_M$,
where $b_0 = 0$ and $b_M = L$. If the layers potentially have different material properties,
but these properties are constant within each layer, we can express $\alpha$ as a
piecewise constant function according to

$$\alpha(x) = \begin{cases}
\alpha_0, & b_0 \le x < b_1,\\
\vdots & \\
\alpha_i, & b_i \le x < b_{i+1},\\
\vdots & \\
\alpha_{M-1}, & b_{M-1} \le x \le b_M.
\end{cases} \tag{3.75}$$

The exact solution (3.74) in case of such a piecewise constant $\alpha$ function is easy
to derive. Assume that $x$ is in the $m$-th layer: $x \in [b_m, b_{m+1}]$. In the integral
$\int_0^x(\alpha(\xi))^{-1}d\xi$ we must integrate through the layers $0, \ldots, m-1$ and then add the
contribution from the remaining part $x - b_m$ into the $m$-th layer:

$$u(x) = U_0 + (U_L - U_0)\frac{\sum_{j=0}^{m-1}(b_{j+1} - b_j)/\alpha(b_j) + (x - b_m)/\alpha(b_m)}{\sum_{j=0}^{M-1}(b_{j+1} - b_j)/\alpha(b_j)}. \tag{3.76}$$

Remark It may sound strange to have a discontinuous $\alpha$ in a differential equation
where one is to differentiate, but a discontinuous $\alpha$ is compensated by a discontinuous
$u_x$ such that $\alpha u_x$ is continuous and therefore can be differentiated as $(\alpha u_x)_x$.
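
Formula (3.76) can be evaluated in a vectorized way with plain NumPy. The sketch below is an independent illustration: the function name `u_layers` and the two-layer test medium are assumptions made here, and the code does not use the book's own utility classes.

```python
import numpy as np

def u_layers(x, b, a, U_0, U_L):
    """Evaluate (3.76). b: M+1 layer boundaries, a: M layer values of alpha."""
    x = np.asarray(x, dtype=float)
    b = np.asarray(b, dtype=float)
    a = np.asarray(a, dtype=float)
    # Integral of 1/alpha from b[0] up to each boundary b[m]
    G_b = np.concatenate(([0.0], np.cumsum(np.diff(b)/a)))
    # Index m of the layer containing each x (x = b[-1] belongs to the last layer)
    m = np.clip(np.searchsorted(b, x, side='right') - 1, 0, len(a) - 1)
    G = G_b[m] + (x - b[m])/a[m]
    return U_0 + (U_L - U_0)*G/G_b[-1]

# Hypothetical two-layer medium: alpha = 1 on [0, 0.5) and alpha = 2 on [0.5, 1]
b = [0, 0.5, 1]
a = [1.0, 2.0]
x = np.array([0, 0.25, 0.5, 0.75, 1])
u = u_layers(x, b, a, U_0=0.0, U_L=3.0)
print(u)
```

In this example the flux $\alpha u_x$ equals 4 in both layers, which illustrates the remark above: the jump in $\alpha$ is compensated by a jump in $u_x$.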


3.5.5 Implementation of Diffusion in a Piecewise Constant Medium 


Programming with piecewise function definitions quickly becomes cumbersome
as the most naive approach is to test in which interval $x$ lies, and then start
evaluating a formula like (3.76). In Python, vectorized expressions may help to
speed up the computations. The convenience classes PiecewiseConstant and
IntegratedPiecewiseConstant in the Heaviside module were made to simplify
programming with functions like (3.75) and expressions like (3.76). These
utilities not only represent piecewise constant functions, but also smoothed versions
of them where the discontinuities can be smoothed out in a controlled fashion.

The PiecewiseConstant class is created by sending in the domain as a 2-tuple
or 2-list and a data object describing the boundaries $b_0, \ldots, b_M$ and the corresponding
function values $\alpha_0, \ldots, \alpha_{M-1}$. More precisely, data is a nested list,
where data[i][0] holds $b_i$ and data[i][1] holds the corresponding value $\alpha_i$,
for $i = 0, \ldots, M-1$. Given $b_i$ and $\alpha_i$ in arrays b and a, it is easy to fill out the
nested list data.

In our application, we want to represent $\alpha$ and $1/\alpha$ as piecewise constant functions,
in addition to the $u(x)$ function which involves the integrals of $1/\alpha$. A class
creating the functions we need and a method for evaluating $u$, can take the form


    import numpy as np
    from Heaviside import PiecewiseConstant, IntegratedPiecewiseConstant

    class SerialLayers:
        """
        b: coordinates of boundaries of layers, b[0] is left boundary
        and b[-1] is right boundary of the domain [0,L].
        a: values of the functions in each layer (len(a) = len(b)-1).
        U_0: u(x) value at left boundary x=0=b[0].
        U_L: u(x) value at right boundary x=L=b[-1].
        """

        def __init__(self, a, b, U_0, U_L, eps=0):
            self.a, self.b = np.asarray(a), np.asarray(b)
            self.eps = eps  # smoothing parameter for smoothed a
            self.U_0, self.U_L = U_0, U_L

            a_data = [[bi, ai] for bi, ai in zip(self.b, self.a)]
            domain = [b[0], b[-1]]
            self.a_func = PiecewiseConstant(domain, a_data, eps)

            # inv_a = 1/a is needed in formulas
            inv_a_data = [[bi, 1./ai] for bi, ai in zip(self.b, self.a)]
            self.inv_a_func = \
                PiecewiseConstant(domain, inv_a_data, eps)
            self.integral_of_inv_a_func = \
                IntegratedPiecewiseConstant(domain, inv_a_data, eps)
            # Denominator in the exact formula is constant
            self.inv_a_0L = self.integral_of_inv_a_func(b[-1])

        def __call__(self, x):
            # Exact solution (3.76) evaluated at x
            solution = self.U_0 + (self.U_L-self.U_0)*\
                       self.integral_of_inv_a_func(x)/self.inv_a_0L
            return solution


A visualization method is also convenient to have. Below we plot $u(x)$ along
with $\alpha(x)$ (which works well as long as $\max\alpha(x)$ is of the same size as
$\max u = \max(U_0, U_L)$).


    class SerialLayers:
        ...

        def plot(self):
            x, y_a = self.a_func.plot()
            x = np.asarray(x); y_a = np.asarray(y_a)
            y_u = self(x)   # evaluate u via __call__
            import matplotlib.pyplot as plt
            plt.figure()
            plt.plot(x, y_u, 'b')
            plt.plot(x, y_a, 'r')
            ymin = -0.1
            ymax = 1.2*max(y_u.max(), y_a.max())
            plt.axis([x[0], x[-1], ymin, ymax])
            plt.legend(['solution $u$', 'coefficient $a$'], loc='upper left')
            if self.eps > 0:
                plt.title('Smoothing eps: %s' % self.eps)
            plt.savefig('tmp.pdf')
            plt.savefig('tmp.png')
            plt.show()


Figure 3.12 shows the case where 


    b = [0, 0.25, 0.5, 1]   # material boundaries
    a = [0.2, 0.4, 4]       # material values
    U_0 = 0.5;  U_L = 5     # boundary conditions

Fig. 3.12 Solution of the stationary diffusion equation corresponding to a piecewise constant
diffusion coefficient

Fig. 3.13 Solution of the stationary diffusion equation corresponding to a smoothed piecewise
constant diffusion coefficient (eps=0.05)


By adding the eps parameter to the constructor of the SerialLayers class,
we can experiment with smoothed versions of $\alpha$ and see the (small) impact on $u$.
Figure 3.13 shows the result.


3.5.6 Axi-Symmetric Diffusion


Suppose we have a diffusion process taking place in a straight tube with radius $R$.
We assume axi-symmetry such that $u$ is just a function of $r$ and $t$, with $r$ being the
radial distance from the center axis of the tube to a point. With such axi-symmetry
it is advantageous to introduce cylindrical coordinates $r$, $\theta$, and $z$, where $z$ is in
the direction of the tube and $(r, \theta)$ are polar coordinates in a cross section.
Axi-symmetry means that all quantities are independent of $\theta$. From the relations
$x = r\cos\theta$, $y = r\sin\theta$, and $z = z$, between Cartesian and cylindrical coordinates, one can
(with some effort) derive the diffusion equation in cylindrical coordinates, which
with axi-symmetry takes the form

$$\frac{\partial u}{\partial t} = \frac{1}{r}\frac{\partial}{\partial r}\left(r\alpha(r, z)\frac{\partial u}{\partial r}\right) + \frac{\partial}{\partial z}\left(\alpha(r, z)\frac{\partial u}{\partial z}\right) + f(r, z, t).$$

Let us assume that $u$ does not change along the tube axis so it suffices to compute
variations in a cross section. Then $\partial u/\partial z = 0$ and we have a 1D diffusion equation
in the radial coordinate $r$ and time $t$. In particular, we shall address the initial-boundary
value problem

$$\frac{\partial u}{\partial t} = \frac{1}{r}\frac{\partial}{\partial r}\left(r\alpha(r)\frac{\partial u}{\partial r}\right) + f(t), \quad r \in (0, R),\ t \in (0, T], \tag{3.77}$$
$$\frac{\partial u}{\partial r}(0, t) = 0, \quad t \in (0, T], \tag{3.78}$$
$$u(R, t) = 0, \quad t \in (0, T], \tag{3.79}$$
$$u(r, 0) = I(r), \quad r \in [0, R]. \tag{3.80}$$

The condition (3.78) is a necessary symmetry condition at $r = 0$, while (3.79) could
be any Dirichlet or Neumann condition (or Robin condition in case of cooling or
heating).

The finite difference approximation will need the discretized version of the PDE
for $r = 0$ (just as we use the PDE at the boundary when implementing Neumann
conditions). However, discretizing the PDE at $r = 0$ poses a problem because of
the $1/r$ factor. We therefore need to work out the PDE for discretization at $r = 0$
with care. Let us, for the case of constant $\alpha$, expand the spatial derivative term to

$$\alpha\frac{\partial^2 u}{\partial r^2} + \alpha\frac{1}{r}\frac{\partial u}{\partial r}.$$

The last term faces a difficulty at $r = 0$, since it becomes a 0/0 expression caused
by the symmetry condition at $r = 0$. However, L'Hôpital's rule can be used:

$$\lim_{r\rightarrow 0}\frac{1}{r}\frac{\partial u}{\partial r} = \frac{\partial^2 u}{\partial r^2}.$$

The PDE at $r = 0$ therefore becomes

$$\frac{\partial u}{\partial t} = 2\alpha\frac{\partial^2 u}{\partial r^2} + f(t). \tag{3.81}$$

For a variable coefficient $\alpha(r)$ the expanded spatial derivative term reads

$$\alpha(r)\frac{\partial^2 u}{\partial r^2} + \frac{1}{r}\left(\alpha(r) + r\alpha'(r)\right)\frac{\partial u}{\partial r}.$$

We are interested in this expression for $r = 0$. A necessary condition for $u$ to
be axi-symmetric is that all input data, including $\alpha$, must also be axi-symmetric,
implying that $\alpha'(0) = 0$ (the second term vanishes anyway because of $r = 0$). The
limit of interest is

$$\lim_{r\rightarrow 0}\frac{1}{r}\alpha(r)\frac{\partial u}{\partial r} = \alpha(0)\frac{\partial^2 u}{\partial r^2}.$$

The PDE at $r = 0$ now looks like

$$\frac{\partial u}{\partial t} = 2\alpha(0)\frac{\partial^2 u}{\partial r^2} + f(t), \tag{3.82}$$

so there is no essential difference between the constant coefficient and variable
coefficient cases.
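The second-order derivative in (3.81) and (3.82) is discretized in the usual way.

$$2\alpha\frac{\partial^2}{\partial r^2}u(r_0, t_n) \approx [2\alpha D_rD_r u]_0^n = 2\alpha\frac{u_1^n - 2u_0^n + u_{-1}^n}{\Delta r^2}.$$

The fictitious value $u_{-1}^n$ can be eliminated using the discrete symmetry condition

$$[D_{2r}u = 0]_0^n \quad\Rightarrow\quad u_{-1}^n = u_1^n,$$

which then gives the modified approximation to the term with the second-order
derivative of $u$ in $r$ at $r = 0$:

$$4\alpha\frac{u_1^n - u_0^n}{\Delta r^2}. \tag{3.83}$$

The discretization of the term with the second-order derivative in $r$ at any internal
mesh point is straightforward:

$$\left[\frac{1}{r}\frac{\partial}{\partial r}\left(r\alpha\frac{\partial u}{\partial r}\right)\right]_i^n \approx [r^{-1}D_r(r\alpha D_r u)]_i^n
= \frac{1}{r_i}\frac{1}{\Delta r^2}\left(r_{i+\frac{1}{2}}\alpha_{i+\frac{1}{2}}(u_{i+1}^n - u_i^n) - r_{i-\frac{1}{2}}\alpha_{i-\frac{1}{2}}(u_i^n - u_{i-1}^n)\right).$$

To complete the discretization, we need a scheme in time, but that can be done
as before and does not interfere with the discretization in space.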
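
To make the spatial discretization concrete, here is a minimal sketch of one Forward Euler step for the axi-symmetric problem, combining the special formula at $r = 0$ with the interior formula. The function name `step_axisym_FE`, the use of arithmetic means for $r\alpha$ at the midpoints, and the choice of Forward Euler in time are assumptions made for this illustration. The check uses $u = R^2 - r^2$, which is a stationary solution when $\alpha$ is constant and $f = 4\alpha$.

```python
import numpy as np

def step_axisym_FE(u, r, alpha, dt, f=0.0):
    """One Forward Euler step on a uniform mesh r[0]=0,...,r[-1]=R, u(R)=0.

    alpha: array of coefficient values at the mesh points, f: source value.
    """
    dr = r[1] - r[0]
    u_new = np.empty_like(u)
    # r*alpha at the midpoints i+1/2 (arithmetic means, an assumed choice)
    ra = 0.5*(r[1:] + r[:-1])*0.5*(alpha[1:] + alpha[:-1])
    flux = ra*(u[1:] - u[:-1])
    # Interior points: (1/r_i)*(flux_{i+1/2} - flux_{i-1/2})/dr**2
    u_new[1:-1] = u[1:-1] + dt*((flux[1:] - flux[:-1])/(r[1:-1]*dr**2) + f)
    # r = 0: symmetry-modified approximation 4*alpha*(u_1 - u_0)/dr**2
    u_new[0] = u[0] + dt*(4*alpha[0]*(u[1] - u[0])/dr**2 + f)
    u_new[-1] = 0.0   # Dirichlet condition at r = R
    return u_new

R = 1.0
r = np.linspace(0, R, 21)
alpha = 0.5*np.ones_like(r)
u = R**2 - r**2
dt = 0.1*(r[1] - r[0])**2
u_new = step_axisym_FE(u, r, alpha, dt, f=4*0.5)
print('max change in one step:', np.abs(u_new - u).max())
```

The difference formulas are exact for quadratic functions of $r$, so the stationary profile is reproduced to machine precision, which makes this a convenient unit test for the treatment of $r = 0$.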

3.5.7 Spherically-Symmetric Diffusion 


Discretization in spherical coordinates Let us now pose the problem from
Sect. 3.5.6 in spherical coordinates, where $u$ only depends on the radial coordinate
$r$ and time $t$. That is, we have spherical symmetry. For simplicity we restrict the
diffusion coefficient $\alpha$ to be a constant. The PDE reads

$$\frac{\partial u}{\partial t} = \frac{\alpha}{r^\gamma}\frac{\partial}{\partial r}\left(r^\gamma\frac{\partial u}{\partial r}\right) + f(t), \tag{3.84}$$

for $r \in (0, R)$ and $t \in (0, T]$. The parameter $\gamma$ is 2 for spherically-symmetric
problems and 1 for axi-symmetric problems. The boundary and initial conditions
have the same mathematical form as in (3.77)-(3.80).

Since the PDE in spherical coordinates has the same form as the PDE in
Sect. 3.5.6, just with the $\gamma$ parameter being different, we can use the same
discretization approach. At the origin $r = 0$ we get problems with the term

$$\frac{\gamma}{r}\frac{\partial u}{\partial r},$$

but L'Hôpital's rule shows that this term equals $\gamma\,\partial^2 u/\partial r^2$, and the PDE at $r = 0$
becomes

$$\frac{\partial u}{\partial t} = (\gamma + 1)\alpha\frac{\partial^2 u}{\partial r^2} + f(t). \tag{3.85}$$

The associated discrete form is then

$$[D_t u = (\gamma + 1)\alpha D_rD_r\overline{u}^t + \overline{f}^t]_0^{n+\frac{1}{2}}, \tag{3.86}$$

for a Crank-Nicolson scheme.


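Discretization in Cartesian coordinates The spherically-symmetric spatial derivative can be transformed to the Cartesian counterpart by introducing

\[ v(r,t) = ru(r,t). \]

Inserting $u = v/r$ in

\[ \frac{1}{r^2}\frac{\partial}{\partial r}\left(\alpha(r)r^2\frac{\partial u}{\partial r}\right) \]

yields

\[ \frac{1}{r^2}\left(r\left(\frac{d\alpha}{dr}\frac{\partial v}{\partial r} + \alpha\frac{\partial^2 v}{\partial r^2}\right) - \frac{d\alpha}{dr}v\right). \]

The two terms in the inner parenthesis can be combined to

\[ r\frac{\partial}{\partial r}\left(\alpha\frac{\partial v}{\partial r}\right). \]

After multiplying the PDE for $u$ by $r$, the PDE for $v$ takes the form

\[ \frac{\partial v}{\partial t} = \frac{\partial}{\partial r}\left(\alpha\frac{\partial v}{\partial r}\right) - \frac{1}{r}\frac{d\alpha}{dr}v + rf(t),\quad r\in(0,R),\ t\in(0,T]. \tag{3.87} \]

For $\alpha$ constant we immediately realize that we can reuse a solver in Cartesian coordinates to compute $v$. With variable $\alpha$, a "reaction" term $v/r$ needs to be added to the Cartesian solver. The boundary condition $\partial u/\partial r = 0$ at $r = 0$, implied by symmetry, forces $v(0,t) = 0$, because

\[ \frac{\partial u}{\partial r} = \frac{1}{r^2}\left(r\frac{\partial v}{\partial r} - v\right) = 0,\quad r = 0. \]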

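3.6 Diffusion in 2D


We now address diffusion in two space dimensions:

\[ \frac{\partial u}{\partial t} = \alpha\left(\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2}\right) + f(x,y), \tag{3.88} \]

in a domain

\[ (x,y)\in(0,L_x)\times(0,L_y),\quad t\in(0,T], \]

with $u = 0$ on the boundary and $u(x,y,0) = I(x,y)$ as initial condition.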


3.6.1 Discretization 


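For generality, it is natural to use a $\theta$-rule for the time discretization. Standard, second-order accurate finite differences are used for the spatial derivatives. We sample the PDE at a space-time point $(i,j,n+\frac{1}{2})$ and apply the difference approximations:

\[ [D_t u]^{n+\frac{1}{2}} = \theta[\alpha(D_xD_x u + D_yD_y u) + f]^{n+1} + (1-\theta)[\alpha(D_xD_x u + D_yD_y u) + f]^n. \tag{3.89} \]

Written out,

\[ \frac{u^{n+1}_{i,j} - u^n_{i,j}}{\Delta t} = \theta\left(\alpha\left(\frac{u^{n+1}_{i-1,j} - 2u^{n+1}_{i,j} + u^{n+1}_{i+1,j}}{\Delta x^2} + \frac{u^{n+1}_{i,j-1} - 2u^{n+1}_{i,j} + u^{n+1}_{i,j+1}}{\Delta y^2}\right) + f^{n+1}_{i,j}\right) + (1-\theta)\left(\alpha\left(\frac{u^n_{i-1,j} - 2u^n_{i,j} + u^n_{i+1,j}}{\Delta x^2} + \frac{u^n_{i,j-1} - 2u^n_{i,j} + u^n_{i,j+1}}{\Delta y^2}\right) + f^n_{i,j}\right). \tag{3.90} \]

We collect the unknowns on the left-hand side

\[ u^{n+1}_{i,j} - \theta\left(F_x(u^{n+1}_{i-1,j} - 2u^{n+1}_{i,j} + u^{n+1}_{i+1,j}) + F_y(u^{n+1}_{i,j-1} - 2u^{n+1}_{i,j} + u^{n+1}_{i,j+1})\right) = (1-\theta)\left(F_x(u^n_{i-1,j} - 2u^n_{i,j} + u^n_{i+1,j}) + F_y(u^n_{i,j-1} - 2u^n_{i,j} + u^n_{i,j+1})\right) + \theta\Delta t f^{n+1}_{i,j} + (1-\theta)\Delta t f^n_{i,j} + u^n_{i,j}, \tag{3.91} \]

where

\[ F_x = \frac{\alpha\Delta t}{\Delta x^2},\quad F_y = \frac{\alpha\Delta t}{\Delta y^2}, \]

are the Fourier numbers in $x$ and $y$ direction, respectively.

[Fig. 3.14 3x2 2D mesh. Each mesh point is labeled "(i,j): p", e.g., (3,0): 3, (3,1): 7, (0,2): 8, (1,2): 9, (2,2): 10, (3,2): 11.]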


3.6.2 Numbering of Mesh Points Versus Equations and Unknowns 


The equations (3.91) are coupled at the new time level n + 1. That is, we must solve 
a system of (linear) algebraic equations, which we will write as Ac = b, where A 
is the coefficient matrix, c is the vector of unknowns, and b is the right-hand side. 

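Let us examine the equations in $Ac = b$ on a mesh with $N_x = 3$ and $N_y = 2$ cells in the respective spatial directions. The spatial mesh is depicted in Fig. 3.14. The equations at the boundary just implement the boundary condition $u = 0$:

\[ u^{n+1}_{0,0} = u^{n+1}_{1,0} = u^{n+1}_{2,0} = u^{n+1}_{3,0} = u^{n+1}_{0,1} = u^{n+1}_{3,1} = u^{n+1}_{0,2} = u^{n+1}_{1,2} = u^{n+1}_{2,2} = u^{n+1}_{3,2} = 0. \]

We are left with two interior points, with $i = 1$, $j = 1$ and $i = 2$, $j = 1$. The corresponding equations are

\[ u^{n+1}_{i,j} - \theta\left(F_x(u^{n+1}_{i-1,j} - 2u^{n+1}_{i,j} + u^{n+1}_{i+1,j}) + F_y(u^{n+1}_{i,j-1} - 2u^{n+1}_{i,j} + u^{n+1}_{i,j+1})\right) = (1-\theta)\left(F_x(u^n_{i-1,j} - 2u^n_{i,j} + u^n_{i+1,j}) + F_y(u^n_{i,j-1} - 2u^n_{i,j} + u^n_{i,j+1})\right) + \theta\Delta t f^{n+1}_{i,j} + (1-\theta)\Delta t f^n_{i,j} + u^n_{i,j}, \]

for $(i,j) = (1,1)$ and $(i,j) = (2,1)$.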

There are in total 12 unknowns $u^{n+1}_{i,j}$ for $i = 0,1,2,3$ and $j = 0,1,2$. To solve the equations, we need to form a matrix system $Ac = b$. In that system, the solution vector $c$ can only have one index. Thus, we need a numbering of the unknowns with one index, not two as used in the mesh. We introduce a mapping $m(i,j)$ from a mesh point with indices $(i,j)$ to the corresponding unknown $p$ in the equation system:

\[ p = m(i,j) = j(N_x + 1) + i. \]

When $i$ and $j$ run through their values, we see the following mapping to $p$:

    (0,0) -> 0,  (1,0) -> 1,  (2,0) -> 2,   (3,0) -> 3,
    (0,1) -> 4,  (1,1) -> 5,  (2,1) -> 6,   (3,1) -> 7,
    (0,2) -> 8,  (1,2) -> 9,  (2,2) -> 10,  (3,2) -> 11.

That is, we number the points along the $x$ axis, starting with $y = 0$, and then progress one "horizontal" mesh line at a time. In Fig. 3.14 you can see that the $(i,j)$ and the corresponding single index ($p$) are listed for each mesh point.

We could equally well have numbered the equations in other ways, e.g., let the $j$ index be the fastest varying index: $p = m(i,j) = i(N_y + 1) + j$.
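The mapping between mesh indices and the single index is easy to experiment with in a few lines. The sketch below uses the 3x2 mesh of Fig. 3.14; the inverse mapping `m_inverse` is a hypothetical helper added for illustration and does not appear in the text.

```python
Nx, Ny = 3, 2

def m(i, j):
    """Single index p for mesh point (i, j); the i index varies fastest."""
    return j*(Nx + 1) + i

def m_inverse(p):
    """Recover (i, j) from p (hypothetical helper, for illustration only)."""
    j, i = divmod(p, Nx + 1)
    return i, j

# Tabulate the numbering one horizontal mesh line at a time
numbering = {(i, j): m(i, j) for j in range(Ny + 1) for i in range(Nx + 1)}
for (i, j), p in sorted(numbering.items(), key=lambda item: item[1]):
    print('(%d,%d) -> %d' % (i, j, p))
```

The two interior points (1,1) and (2,1) receive the numbers 5 and 6, which are the only rows of the coefficient matrix with off-diagonal entries.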

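Let us form the coefficient matrix $A$, or more precisely, insert a matrix element (according to Python's convention with zero as base index) for each of the nonzero elements in $A$ (the indices run through the values of $p$, i.e., $p = 0,\ldots,11$):

\[ \left(\begin{array}{cccccccccccc}
(0,0) & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\
0 & (1,1) & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\
0 & 0 & (2,2) & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\
0 & 0 & 0 & (3,3) & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\
0 & 0 & 0 & 0 & (4,4) & 0 & 0 & 0 & 0 & 0 & 0 & 0\\
0 & (5,1) & 0 & 0 & (5,4) & (5,5) & (5,6) & 0 & 0 & (5,9) & 0 & 0\\
0 & 0 & (6,2) & 0 & 0 & (6,5) & (6,6) & (6,7) & 0 & 0 & (6,10) & 0\\
0 & 0 & 0 & 0 & 0 & 0 & 0 & (7,7) & 0 & 0 & 0 & 0\\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & (8,8) & 0 & 0 & 0\\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & (9,9) & 0 & 0\\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & (10,10) & 0\\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & (11,11)
\end{array}\right) \]

Here is a more compact visualization of the coefficient matrix where we insert dots for zeros and bullets for non-zero elements:

    • . . . . . . . . . . .
    . • . . . . . . . . . .
    . . • . . . . . . . . .
    . . . • . . . . . . . .
    . . . . • . . . . . . .
    . • . . • • • . . • . .
    . . • . . • • • . . • .
    . . . . . . . • . . . .
    . . . . . . . . • . . .
    . . . . . . . . . • . .
    . . . . . . . . . . • .
    . . . . . . . . . . . •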


It is clearly seen that most of the elements are zero. This is a general feature of 
coefficient matrices arising from discretizing PDEs by finite difference methods. 
We say that the matrix is sparse. 

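Let $A_{p,q}$ be the value of element $(p,q)$ in the coefficient matrix $A$, where $p$ and $q$ now correspond to the numbering of the unknowns in the equation system. We have $A_{p,q} = 1$ for $p = q = 0,1,2,3,4,7,8,9,10,11$, corresponding to all the known boundary values. Let $p$ be $m(i,j)$, i.e., the single index corresponding to mesh point $(i,j)$. Then we have

\[ A_{m(i,j),m(i,j)} = A_{p,p} = 1 + 2\theta(F_x + F_y), \tag{3.92} \]
\[ A_{p,m(i-1,j)} = A_{p,p-1} = -\theta F_x, \tag{3.93} \]
\[ A_{p,m(i+1,j)} = A_{p,p+1} = -\theta F_x, \tag{3.94} \]
\[ A_{p,m(i,j-1)} = A_{p,p-(N_x+1)} = -\theta F_y, \tag{3.95} \]
\[ A_{p,m(i,j+1)} = A_{p,p+(N_x+1)} = -\theta F_y, \tag{3.96} \]

for the equations associated with the two interior mesh points. The diagonal entry (3.92) collects the contributions $2\theta F_x$ and $2\theta F_y$ from the two second-order differences in (3.91). At these interior points, the single index $p$ takes on the specific values $p = 5, 6$, corresponding to the values $(1,1)$ and $(2,1)$ of the pair $(i,j)$.

The above values for $A_{p,q}$ can be inserted in the matrix, here with the abbreviation $D = 1 + 2\theta(F_x + F_y)$:

\[ \left(\begin{array}{cccccccccccc}
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\
0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\
0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\
0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\
0 & -\theta F_y & 0 & 0 & -\theta F_x & D & -\theta F_x & 0 & 0 & -\theta F_y & 0 & 0\\
0 & 0 & -\theta F_y & 0 & 0 & -\theta F_x & D & -\theta F_x & 0 & 0 & -\theta F_y & 0\\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0\\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0\\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0\\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0\\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1
\end{array}\right) \]

The corresponding right-hand side vector in the equation system has the entries $b_p$, where $p$ numbers the equations. We have

\[ b_0 = b_1 = b_2 = b_3 = b_4 = b_7 = b_8 = b_9 = b_{10} = b_{11} = 0, \]

for the boundary values. For the equations associated with the interior points, we get for $p = 5, 6$, corresponding to $i = 1, 2$ and $j = 1$:

\[ b_p = u^n_{i,j} + (1-\theta)\left(F_x(u^n_{i+1,j} - 2u^n_{i,j} + u^n_{i-1,j}) + F_y(u^n_{i,j+1} - 2u^n_{i,j} + u^n_{i,j-1})\right) + \theta\Delta t f^{n+1}_{i,j} + (1-\theta)\Delta t f^n_{i,j}. \]

Recall that $p = m(i,j) = j(N_x + 1) + i$ in this expression.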

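We can, as an alternative, leave the boundary mesh points out of the matrix system. For a mesh with $N_x = 3$ and $N_y = 2$ there are only two internal mesh points whose unknowns will enter the matrix system. We must now number the unknowns at the interior points:

\[ p = (j-1)(N_x - 1) + i, \]

for $i = 1,\ldots,N_x-1$, $j = 1,\ldots,N_y-1$.

We can continue with illustrating a bit larger mesh, $N_x = 4$ and $N_y = 3$, see Fig. 3.15. The corresponding coefficient matrix with dots for zeros and bullets for non-zeroes looks as follows (values at boundary points are included in the equation system):

    • . . . . . . . . . . . . . . . . . . .
    . • . . . . . . . . . . . . . . . . . .
    . . • . . . . . . . . . . . . . . . . .
    . . . • . . . . . . . . . . . . . . . .
    . . . . • . . . . . . . . . . . . . . .
    . . . . . • . . . . . . . . . . . . . .
    . • . . . • • • . . . • . . . . . . . .
    . . • . . . • • • . . . • . . . . . . .
    . . . • . . . • • • . . . • . . . . . .
    . . . . . . . . . • . . . . . . . . . .
    . . . . . . . . . . • . . . . . . . . .
    . . . . . . • . . . • • • . . . • . . .
    . . . . . . . • . . . • • • . . . • . .
    . . . . . . . . • . . . • • • . . . • .
    . . . . . . . . . . . . . . • . . . . .
    . . . . . . . . . . . . . . . • . . . .
    . . . . . . . . . . . . . . . . • . . .
    . . . . . . . . . . . . . . . . . • . .
    . . . . . . . . . . . . . . . . . . • .
    . . . . . . . . . . . . . . . . . . . •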


The coefficient matrix is banded 

Besides being sparse, we observe that the coefficient matrix is banded: it has five distinct bands. We have the diagonal $A_{i,i}$, the subdiagonal $A_{i,i-1}$, the superdiagonal $A_{i,i+1}$, a lower diagonal $A_{i,i-(N_x+1)}$, and an upper diagonal $A_{i,i+(N_x+1)}$. The other matrix entries are known to be zero. With $N_x + 1 = N_y + 1 = N$, only a fraction $5N^{-2}$ of the matrix entries are nonzero, so the matrix is clearly very sparse for relevant $N$ values. The more we can compute with the nonzeros only, the faster the solution methods will potentially be.

[Fig. 3.15 4x3 2D mesh. Each mesh point is labeled "(i,j): p", e.g., (0,0): 0, (4,0): 4, (0,1): 5, (4,1): 9, (0,2): 10, (4,2): 14, (0,3): 15, (1,3): 16, (2,3): 17, (3,3): 18, (4,3): 19.]
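The sparsity estimate can be checked by counting the nonzeros row by row: each interior mesh point gives five entries and each boundary point gives one. The function below is a small sketch written for this purpose; its name is hypothetical.

```python
def nonzero_fraction(Nx, Ny):
    """Fraction of nonzero entries in A when boundary points are included."""
    N = (Nx + 1)*(Ny + 1)               # number of unknowns
    n_interior = (Nx - 1)*(Ny - 1)
    n_boundary = N - n_interior
    nnz = 5*n_interior + n_boundary     # five per interior row, one per boundary row
    return nnz/float(N**2)

for n in 10, 100, 1000:
    # Compare the exact count with the estimate 5/(n+1)**2
    print(n, nonzero_fraction(n, n), 5.0/(n + 1)**2)
```

Since the boundary rows hold a single entry, the exact fraction lies slightly below the estimate, and the two approach each other as the mesh is refined.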


3.6.3 Algorithm for Setting Up the Coefficient Matrix 


We looked at a specific mesh in the previous section, formulated the equations, and saw what the corresponding coefficient matrix and right-hand side are. Now our aim is to set up a general algorithm, for any choice of $N_x$ and $N_y$, that produces the coefficient matrix and the right-hand side vector. We start with a zero matrix and vector, run through each mesh point, and fill in the values depending on whether the mesh point is an interior point or on the boundary.

• for $i = 0,\ldots,N_x$
  – for $j = 0,\ldots,N_y$
    * $p = j(N_x+1) + i$
    * if point $(i,j)$ is on the boundary:
        $A_{p,p} = 1$, $b_p = 0$
    * else:
        fill $A_{p,m(i-1,j)}$, $A_{p,m(i+1,j)}$, $A_{p,m(i,j)}$, $A_{p,m(i,j-1)}$, $A_{p,m(i,j+1)}$, and $b_p$

To ease the test on whether $(i,j)$ is on the boundary or not, we can split the loops a bit, starting with the boundary line $j = 0$, then treat the interior lines $1 \le j < N_y$, and finally treat the boundary line $j = N_y$:

• for $i = 0,\ldots,N_x$
  – boundary $j = 0$: $p = j(N_x+1) + i$, $A_{p,p} = 1$
• for $j = 1,\ldots,N_y-1$
  – boundary $i = 0$: $p = j(N_x+1) + i$, $A_{p,p} = 1$
  – for $i = 1,\ldots,N_x-1$
    * interior point $p = j(N_x+1) + i$
    * fill $A_{p,m(i-1,j)}$, $A_{p,m(i+1,j)}$, $A_{p,m(i,j)}$, $A_{p,m(i,j-1)}$, $A_{p,m(i,j+1)}$, and $b_p$
  – boundary $i = N_x$: $p = j(N_x+1) + i$, $A_{p,p} = 1$
• for $i = 0,\ldots,N_x$
  – boundary $j = N_y$: $p = j(N_x+1) + i$, $A_{p,p} = 1$

The right-hand side is set up as follows.

• for $i = 0,\ldots,N_x$
  – boundary $j = 0$: $p = j(N_x+1) + i$, $b_p = 0$
• for $j = 1,\ldots,N_y-1$
  – boundary $i = 0$: $p = j(N_x+1) + i$, $b_p = 0$
  – for $i = 1,\ldots,N_x-1$
    * interior point $p = j(N_x+1) + i$
    * fill $b_p$
  – boundary $i = N_x$: $p = j(N_x+1) + i$, $b_p = 0$
• for $i = 0,\ldots,N_x$
  – boundary $j = N_y$: $p = j(N_x+1) + i$, $b_p = 0$


3.6.4 Implementation with a Dense Coefficient Matrix 


The goal now is to map the algorithms in the previous section to Python code. One should, for computational efficiency reasons, take advantage of the fact that the coefficient matrix is sparse and/or banded, i.e., take advantage of all the zeros. However, we first demonstrate how to fill an $N\times N$ dense square matrix, where $N$ is the number of unknowns, here $N = (N_x+1)(N_y+1)$. The dense matrix is much easier to understand than the sparse matrix case.


import numpy as np

def solver_dense(
    I, a, f, Lx, Ly, Nx, Ny, dt, T, theta=0.5, user_action=None):
    """
    Solve u_t = a*(u_xx + u_yy) + f, u(x,y,0)=I(x,y), with u=0
    on the boundary, on [0,Lx]x[0,Ly]x[0,T], with time step dt,
    using the theta-scheme.
    """
    x = np.linspace(0, Lx, Nx+1)       # mesh points in x dir
    y = np.linspace(0, Ly, Ny+1)       # mesh points in y dir
    dx = x[1] - x[0]
    dy = y[1] - y[0]

    dt = float(dt)                     # avoid integer division
    Nt = int(round(T/float(dt)))
    t = np.linspace(0, Nt*dt, Nt+1)    # mesh points in time

    # Mesh Fourier numbers in each direction
    Fx = a*dt/dx**2
    Fy = a*dt/dy**2


The $u^{n+1}_{i,j}$ and $u^n_{i,j}$ mesh functions are represented by their spatial values at the mesh points:


u   = np.zeros((Nx+1, Ny+1))    # unknown u at new time level
u_n = np.zeros((Nx+1, Ny+1))    # u at the previous time level


It is a good habit (for extensions) to introduce index sets for all mesh points: 


Ix = range(0, Nx+1) 
Iy = range(0, Ny+1) 
It = range(0, Nt+1) 


The initial condition is easy to fill in: 


# Load initial condition into u_n
for i in Ix:
    for j in Iy:
        u_n[i,j] = I(x[i], y[j])


The memory for the coefficient matrix and right-hand side vector is allocated by 


N = (Nx+1)*(Ny+1)  # no of unknowns
A = np.zeros((N, N)) 
b = np.zeros(N) 


The filling of A goes like this: 


m = lambda i, j: j*(Nx+1) + i

# Equations corresponding to j=0, i=0,1,... (u known)
j = 0
for i in Ix:
    p = m(i,j);  A[p, p] = 1

# Loop over all internal mesh points in y direction
# and all mesh points in x direction
for j in Iy[1:-1]:
    i = 0;  p = m(i,j);  A[p, p] = 1   # Boundary
    for i in Ix[1:-1]:                 # Interior points
        p = m(i,j)
        A[p, m(i,j-1)] = - theta*Fy
        A[p, m(i-1,j)] = - theta*Fx
        A[p, p]        = 1 + 2*theta*(Fx+Fy)
        A[p, m(i+1,j)] = - theta*Fx
        A[p, m(i,j+1)] = - theta*Fy
    i = Nx;  p = m(i,j);  A[p, p] = 1  # Boundary

# Equations corresponding to j=Ny, i=0,1,... (u known)
j = Ny
for i in Ix:
    p = m(i,j);  A[p, p] = 1


Since A is independent of time, it can be filled once and for all before the time loop. 
The right-hand side vector must be filled at each time level inside the time loop: 


import scipy.linalg

for n in It[0:-1]:
    # Compute b
    j = 0
    for i in Ix:
        p = m(i,j);  b[p] = 0           # Boundary

    for j in Iy[1:-1]:
        i = 0;  p = m(i,j);  b[p] = 0   # Boundary
        for i in Ix[1:-1]:              # Interior points
            p = m(i,j)
            b[p] = u_n[i,j] + \
              (1-theta)*(
              Fx*(u_n[i+1,j] - 2*u_n[i,j] + u_n[i-1,j]) +\
              Fy*(u_n[i,j+1] - 2*u_n[i,j] + u_n[i,j-1]))\
                + theta*dt*f(i*dx, j*dy, (n+1)*dt) + \
              (1-theta)*dt*f(i*dx, j*dy, n*dt)
        i = Nx;  p = m(i,j);  b[p] = 0  # Boundary

    j = Ny
    for i in Ix:
        p = m(i,j);  b[p] = 0           # Boundary

    # Solve matrix system A*c = b
    c = scipy.linalg.solve(A, b)

    # Fill u with vector c
    for i in Ix:
        for j in Iy:
            u[i,j] = c[m(i,j)]

    # Update u_n before next step
    u_n, u = u, u_n


We use solve from scipy.linalg and not from numpy.linalg. The difference is stated below.

scipy.linalg versus numpy.linalg
Quote from the SciPy documentation²:

scipy.linalg contains all the functions in numpy.linalg plus some other more advanced ones not contained in numpy.linalg.

Another advantage of using scipy.linalg over numpy.linalg is that it is always compiled with BLAS/LAPACK support, while for NumPy this is optional. Therefore, the SciPy version might be faster depending on how NumPy was installed.

Therefore, unless you don't want to add SciPy as a dependency to your NumPy program, use scipy.linalg instead of numpy.linalg.


The code shown above is available in the solver_dense function in the file diffu2D_u0.py, differing only in the boundary conditions, which in the code can be an arbitrary function along each side of the domain.

We do not bother to look at vectorized versions of filling A since a dense matrix is used only for pedagogical reasons in the very first implementation. Vectorization will be treated when A has a sparse matrix representation, as in Sect. 3.6.7.


How to debug the computation of A and b 

A good starting point for debugging the filling of A and b is to choose a very coarse mesh, say $N_x = N_y = 2$, where there is just one internal mesh point, compute the equations by hand, and print out A and b in the code for comparison. If wrong elements in A or b occur, print out each assignment to elements in A and b inside the loops and compare with what you expect.
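The debugging recipe can be sketched as follows for the coefficient matrix. The values of `theta`, `Fx`, and `Fy` are arbitrary numbers chosen for illustration, and `expected_row` holds the hand-computed row for the single interior point (1,1), which has $p = 4$ on this mesh.

```python
import numpy as np

Nx = Ny = 2
theta, Fx, Fy = 0.5, 0.3, 0.7           # arbitrary values, for illustration only
N = (Nx + 1)*(Ny + 1)
m = lambda i, j: j*(Nx + 1) + i

A = np.zeros((N, N))
for j in range(Ny + 1):
    for i in range(Nx + 1):
        p = m(i, j)
        if i in (0, Nx) or j in (0, Ny):
            A[p, p] = 1                          # boundary point
        else:
            A[p, m(i, j-1)] = -theta*Fy
            A[p, m(i-1, j)] = -theta*Fx
            A[p, p] = 1 + 2*theta*(Fx + Fy)
            A[p, m(i+1, j)] = -theta*Fx
            A[p, m(i, j+1)] = -theta*Fy

# Row computed by hand: neighbors (1,0), (0,1), (2,1), (1,2) are p = 1, 3, 5, 7
expected_row = np.zeros(N)
expected_row[[1, 7]] = -theta*Fy
expected_row[[3, 5]] = -theta*Fx
expected_row[4] = 1 + 2*theta*(Fx + Fy)

print(A)
print('max deviation in interior row:', abs(A[4] - expected_row).max())
```

Choosing different values for `Fx` and `Fy` makes an accidental swap of the $x$ and $y$ neighbors visible in the printout.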


To let the user store, analyze, or visualize the solution at each time level, we 
include a callback function, named user_action, to be called before the time loop 
and in each pass in that loop. The function has the signature 


user_action(u, x, xv, y, yv, t, n) 


where u is a two-dimensional array holding the solution at time level n and time t[n]. The x and y coordinates of the mesh points are given by the arrays x and y, respectively. The arrays xv and yv are vectorized representations of the mesh points such that vectorized function evaluations can be invoked. The xv and yv arrays are defined by


xv = x[:,np.newaxis] 
yv = y[np.newaxis, :] 


One can then evaluate, e.g., $f(x,y,t)$ at all internal mesh points at time level n by first evaluating f at all points,


f_a = f(xv, yv, t[n])


and then use slices to extract a view of the values at the internal mesh points: f_a[1:-1,1:-1]. The next section features an example on writing a user_action callback function.

² http://docs.scipy.org/doc/scipy/reference/tutorial/linalg.html
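To see how the xv and yv arrays work, it may help to run a tiny example. The mesh sizes and the source term `f` below are made up for illustration; the point is only that NumPy broadcasting combines an array of shape (Nx+1, 1) with one of shape (1, Ny+1) into a full (Nx+1, Ny+1) array of function values.

```python
import numpy as np

Lx, Ly, Nx, Ny = 1.0, 2.0, 4, 3          # illustrative mesh parameters
x = np.linspace(0, Lx, Nx+1)
y = np.linspace(0, Ly, Ny+1)
xv = x[:, np.newaxis]                    # shape (Nx+1, 1)
yv = y[np.newaxis, :]                    # shape (1, Ny+1)

def f(x, y, t):
    """Hypothetical source term, chosen only for illustration."""
    return x*y**2 + t

f_a = f(xv, yv, 0.5)                     # broadcasting gives shape (Nx+1, Ny+1)
f_interior = f_a[1:-1, 1:-1]             # view of the values at internal points

print(f_a.shape, f_interior.shape)
```

The entry f_a[i,j] equals f(x[i], y[j], t), so the array has the same index convention as u.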


3.6.5 Verification: Exact Numerical Solution 


A good test example to start with is one that preserves the solution $u = 0$, i.e., $f = 0$ and $I(x,y) = 0$. This trivial solution can uncover some bugs.

The first real test example is based on having an exact solution of the discrete equations. This solution is linear in time and quadratic in space:

\[ u(x,y,t) = 5tx(L_x - x)y(L_y - y). \]

Inserting this manufactured solution in the PDE shows that the source term $f$ must be

\[ f(x,y,t) = 5x(L_x - x)y(L_y - y) + 10\alpha t\left(x(L_x - x) + y(L_y - y)\right). \]
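That the manufactured solution fulfills the discrete equations, and not only the PDE, can be checked directly by inserting it in the $\theta$-rule at a mesh point. The sketch below uses $u = 5tx(L_x-x)y(L_y-y)$ with the matching source term; the mesh parameters and the function names `laplace_h` and `residual` are illustrative choices, not part of the book's code.

```python
Lx, Ly, a = 0.75, 1.5, 3.5
dx, dy, dt, theta = 0.25, 0.5, 0.5, 0.5      # illustrative discretization parameters

def u_exact(x, y, t):
    return 5*t*x*(Lx - x)*y*(Ly - y)

def f(x, y, t):
    return 5*x*(Lx - x)*y*(Ly - y) + 10*a*t*(y*(Ly - y) + x*(Lx - x))

def laplace_h(x, y, t):
    """Centered second differences in x and y applied to u_exact."""
    return ((u_exact(x-dx, y, t) - 2*u_exact(x, y, t) + u_exact(x+dx, y, t))/dx**2 +
            (u_exact(x, y-dy, t) - 2*u_exact(x, y, t) + u_exact(x, y+dy, t))/dy**2)

def residual(x, y, n):
    """Residual of the theta-rule at (x, y) between time levels n and n+1."""
    t0, t1 = n*dt, (n + 1)*dt
    lhs = (u_exact(x, y, t1) - u_exact(x, y, t0))/dt
    rhs = (theta*(a*laplace_h(x, y, t1) + f(x, y, t1)) +
           (1 - theta)*(a*laplace_h(x, y, t0) + f(x, y, t0)))
    return lhs - rhs

print(residual(0.25, 0.5, 1))   # zero up to rounding errors
```

The residual vanishes for any $\theta$ because centered differences are exact for quadratic polynomials and the time difference is exact for a function that is linear in $t$.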


We can use the user_action function to compare the numerical solution with 
the exact solution at each time level. A suitable helper function for checking the 
solution goes like this: 


def quadratic(theta, Nx, Ny):

    def u_exact(x, y, t):
        return 5*t*x*(Lx-x)*y*(Ly-y)
    def I(x, y):
        return u_exact(x, y, 0)
    def f(x, y, t):
        return 5*x*(Lx-x)*y*(Ly-y) + 10*a*t*(y*(Ly-y)+x*(Lx-x))

    # Use rectangle to detect errors in switching i and j in scheme
    Lx = 0.75
    Ly = 1.5
    a = 3.5
    # Time step and end time are missing in this listing;
    # the values below are assumed for illustration
    dt = 0.5
    T = 2

    def assert_no_error(u, x, xv, y, yv, t, n):
        """Assert zero error at all mesh points."""
        u_e = u_exact(xv, yv, t[n])
        diff = abs(u - u_e).max()
        tol = 1E-12
        msg = 'diff=%g, step %d, time=%g' % (diff, n, t[n])
        print(msg)
        assert diff < tol, msg

    solver_dense(
        I, a, f, Lx, Ly, Nx, Ny,
        dt, T, theta, user_action=assert_no_error)


A true test function for checking the quadratic solution for several different meshes and $\theta$ values can take the form


def test_quadratic():
    # For each of the three schemes (theta = 1, 0.5, 0), a series of
    # meshes are tested (Nx > Ny and Nx < Ny)
    for theta in [1, 0.5, 0]:
        for Nx in range(2, 6, 2):
            for Ny in range(2, 6, 2):
                print('testing for %dx%d mesh' % (Nx, Ny))
                quadratic(theta, Nx, Ny)


3.6.6 Verification: Convergence Rates 


For 2D verification with convergence rate computations, the expressions and computations just build naturally on what we saw for 1D diffusion. Truncation error analysis and other forms of error analysis point to a numerical error formula like

\[ E = C_t\Delta t^p + C_x\Delta x^2 + C_y\Delta y^2, \]

where $p$, $C_t$, $C_x$, and $C_y$ are constants. Often, the analysis of a Crank-Nicolson method can show that $p = 2$, while the Forward and Backward Euler schemes have $p = 1$.

When checking the error formula empirically, we need to reduce it to a form $E = Ch^r$ with a single discretization parameter $h$ and some rate $r$ to be estimated. For the Backward Euler method, where $p = 1$, we can introduce a single discretization parameter according to

\[ h = \Delta x^2 = \Delta y^2,\quad h = K^{-1}\Delta t, \]

where $K$ is a constant. The error formula then becomes

\[ E = C_tKh + C_xh + C_yh = \tilde Ch,\quad \tilde C = C_tK + C_x + C_y. \]

The simplest choice is obviously $K = 1$. With the Forward Euler method, however, stability requires $\Delta t = hK \le h/(4\alpha)$, so $K \le 1/(4\alpha)$.

For the Crank-Nicolson method, $p = 2$, and we can simply choose

\[ h = \Delta x = \Delta y = \Delta t, \]

since there is no restriction on $\Delta t$ in terms of $\Delta x$ and $\Delta y$.

A frequently used error measure is the $\ell^2$ norm of the error mesh point values. Section 2.2.3 and the formula (2.26) show the error measure for a 1D time-dependent problem. The extension to the current 2D problem reads

\[ E = \left(\Delta t\Delta x\Delta y\sum_{n=0}^{N_t}\sum_{i=0}^{N_x}\sum_{j=0}^{N_y}\left(u_e(x_i,y_j,t_n) - u^n_{i,j}\right)^2\right)^{\frac{1}{2}}. \]

One attractive manufactured solution is

\[ u_e = e^{-pt}\sin(k_xx)\sin(k_yy),\quad k_x = \frac{\pi}{L_x},\ k_y = \frac{\pi}{L_y}, \]

where $p$ can be arbitrary. The required source term is

\[ f = \left(\alpha(k_x^2 + k_y^2) - p\right)u_e. \]


The function convergence_rates in diffu2D_u0.py implements a conver- 
gence rate test. Two potential difficulties are important to be aware of: 


1. The error formula is assumed to be correct when $h\to 0$, so for coarse meshes the estimated rate $r$ may be somewhat away from the expected value. Fine meshes may lead to prohibitively long execution times.

2. Choosing $p = \alpha(k_x^2 + k_y^2)$ in the manufactured solution above seems attractive ($f = 0$), but leads to a slower approach to the asymptotic range where the error formula is valid (i.e., $r$ fluctuates and needs finer meshes to stabilize).
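Given a sequence of experiments with parameters $h_0 > h_1 > \cdots$ and measured errors $E_0, E_1, \ldots$, the model $E = Ch^r$ gives the estimate $r_i = \ln(E_{i-1}/E_i)/\ln(h_{i-1}/h_i)$ for each pair of consecutive experiments. The sketch below implements this formula; the function name `estimate_rates` is hypothetical (it is not the `convergence_rates` function of diffu2D_u0.py), and the data are synthetic numbers that obey $E = 3h^2$ exactly, not measured errors.

```python
from math import log

def estimate_rates(h, E):
    """Return r_i = ln(E[i-1]/E[i])/ln(h[i-1]/h[i]) for consecutive experiments."""
    return [log(E[i-1]/E[i])/log(h[i-1]/h[i]) for i in range(1, len(h))]

# Synthetic data obeying E = C*h**2 (made up for illustration, not measured)
h = [0.1/2**k for k in range(4)]
E = [3.0*h_**2 for h_ in h]
rates = estimate_rates(h, E)
print(rates)
```

With real errors the first few rates usually deviate from the asymptotic value, which is the first difficulty listed above, so a test should typically compare only the last rate with the expected one.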


3.6.7 Implementation with a Sparse Coefficient Matrix 


We used a sparse matrix implementation in Sect. 3.2.2 for a 1D problem with a tridiagonal matrix. The present matrix, arising from a 2D problem, has five diagonals, but we can use the same sparse matrix data structure scipy.sparse.diags.


Understanding the diagonals Let us look closer at the diagonals in the example with a 4 x 3 mesh as depicted in Fig. 3.15 and its associated matrix visualized by dots for zeros and bullets for nonzeros. From the example mesh, we may generalize to an $N_x\times N_y$ mesh.

    0 = m(0,0)                      • . . . . . . . . . . . . . . . . . . .
    1 = m(1,0)                      . • . . . . . . . . . . . . . . . . . .
    2 = m(2,0)                      . . • . . . . . . . . . . . . . . . . .
    3 = m(3,0)                      . . . • . . . . . . . . . . . . . . . .
    N_x = m(N_x,0)                  . . . . • . . . . . . . . . . . . . . .
    N_x+1 = m(0,1)                  . . . . . • . . . . . . . . . . . . . .
    (N_x+1)+1 = m(1,1)              . • . . . • • • . . . • . . . . . . . .
    (N_x+1)+2 = m(2,1)              . . • . . . • • • . . . • . . . . . . .
    (N_x+1)+3 = m(3,1)              . . . • . . . • • • . . . • . . . . . .
    (N_x+1)+N_x = m(N_x,1)          . . . . . . . . . • . . . . . . . . . .
    2(N_x+1) = m(0,2)               . . . . . . . . . . • . . . . . . . . .
    2(N_x+1)+1 = m(1,2)             . . . . . . • . . . • • • . . . • . . .
    2(N_x+1)+2 = m(2,2)             . . . . . . . • . . . • • • . . . • . .
    2(N_x+1)+3 = m(3,2)             . . . . . . . . • . . . • • • . . . • .
    2(N_x+1)+N_x = m(N_x,2)         . . . . . . . . . . . . . . • . . . . .
    N_y(N_x+1) = m(0,N_y)           . . . . . . . . . . . . . . . • . . . .
    N_y(N_x+1)+1 = m(1,N_y)         . . . . . . . . . . . . . . . . • . . .
    N_y(N_x+1)+2 = m(2,N_y)         . . . . . . . . . . . . . . . . . • . .
    N_y(N_x+1)+3 = m(3,N_y)         . . . . . . . . . . . . . . . . . . • .
    N_y(N_x+1)+N_x = m(N_x,N_y)     . . . . . . . . . . . . . . . . . . . •


The main diagonal has $N = (N_x+1)(N_y+1)$ elements, while the sub- and super-diagonals have $N - 1$ elements. By looking at the matrix above, we realize that the lower diagonal starts in row $N_x + 1$ and goes to row $N$, so its length is $N - (N_x+1)$. Similarly, the upper diagonal starts at row 0 and lasts to row $N - (N_x+1)$, so it has the same length. Based on this information, we declare the diagonals by


main   = np.zeros(N)            # diagonal
lower  = np.zeros(N-1)          # subdiagonal
upper  = np.zeros(N-1)          # superdiagonal
lower2 = np.zeros(N-(Nx+1))     # lower diagonal
upper2 = np.zeros(N-(Nx+1))     # upper diagonal
b      = np.zeros(N)            # right-hand side


Filling the diagonals We run through all mesh points and fill in elements on the various diagonals. The line of mesh points corresponding to $j = 0$ are all on the boundary, and only the main diagonal gets a contribution:


m = lambda i, j: j*(Nx+1) + i
j = 0; main[m(0,j):m(Nx+1,j)] = 1  # j=0 boundary line


Then we run through all interior $j = \mbox{const}$ lines of mesh points. The first and the last point on each line, $i = 0$ and $i = N_x$, correspond to boundary points:


for j in Iy[1:-1]:             # Interior mesh lines j=1,...,Ny-1
    i = 0;   main[m(i,j)] = 1  # Boundary
    i = Nx;  main[m(i,j)] = 1  # Boundary

For the interior mesh points $i = 1,\ldots,N_x-1$ on a mesh line $y = \mbox{const}$ we can start with the main diagonal. The entries to be filled go from $i = 1$ to $i = N_x - 1$ so the relevant slice in the main vector is m(1,j):m(Nx,j):


main[m(1,j):m(Nx,j)] = 1 + 2*theta*(Fx+Fy)


The upper array for the superdiagonal has its index 0 corresponding to row 0 in the matrix, and the array entries to be set go from $m(1,j)$ to $m(N_x-1,j)$:


upper[m(1,j):m(Nx,j)] = - theta*Fx


The subdiagonal (lower array), however, has its index 0 corresponding to row 1, so there is an offset of 1 in indices compared to the matrix. The first nonzero occurs (interior point) at a mesh line $j = \mbox{const}$ corresponding to matrix row $m(1,j)$, and the corresponding array index in lower is then $m(1,j) - 1$. To fill the entries for the rows $m(1,j)$ to $m(N_x-1,j)$ we set the following slice in lower:


lower_offset = 1
lower[m(1,j)-lower_offset:m(Nx,j)-lower_offset] = - theta*Fx


For the upper diagonal, its index 0 corresponds to matrix row 0, so there is no offset and we can set the entries correspondingly to upper:


upper2[m(1,j):m(Nx,j)] = - theta*Fy

The lower2 diagonal, however, has its first index 0 corresponding to row $N_x + 1$, so here we need to subtract the offset $N_x + 1$:


lower2_offset = Nx+1
lower2[m(1,j)-lower2_offset:m(Nx,j)-lower2_offset] = - theta*Fy


We can now summarize the above code lines for setting the entries in the sparse 
matrix representation of the coefficient matrix: 


lower_offset = 1
lower2_offset = Nx+1
m = lambda i, j: j*(Nx+1) + i

j = 0; main[m(0,j):m(Nx+1,j)] = 1  # j=0 boundary line
for j in Iy[1:-1]:             # Interior mesh lines j=1,...,Ny-1
    i = 0;   main[m(i,j)] = 1  # Boundary
    i = Nx;  main[m(i,j)] = 1  # Boundary
    # Interior i points: i=1,...,N_x-1
    lower2[m(1,j)-lower2_offset:m(Nx,j)-lower2_offset] = - theta*Fy
    lower[m(1,j)-lower_offset:m(Nx,j)-lower_offset] = - theta*Fx
    main[m(1,j):m(Nx,j)] = 1 + 2*theta*(Fx+Fy)
    upper[m(1,j):m(Nx,j)] = - theta*Fx
    upper2[m(1,j):m(Nx,j)] = - theta*Fy
j = Ny; main[m(0,j):m(Nx+1,j)] = 1  # Boundary line


The next task is to create the sparse matrix from these diagonals: 


import scipy.sparse

A = scipy.sparse.diags(
    diagonals=[main, lower, upper, lower2, upper2],
    offsets=[0, -lower_offset, lower_offset,
             -lower2_offset, lower2_offset],
    shape=(N, N), format='csr')
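Because the offsets are easy to get wrong, it can be worthwhile to compare the diagonal-based filling with an entry-by-entry filling on a small mesh. The sketch below does this for the 4x3 mesh with arbitrary values of `theta`, `Fx`, and `Fy`. As an assumption made to keep the example dense and short, `np.diag(v, k)` is used as a stand-in for `scipy.sparse.diags`; both place array `v` on diagonal `k` with the same index convention.

```python
import numpy as np

Nx, Ny = 4, 3
theta, Fx, Fy = 0.5, 0.3, 0.7            # arbitrary illustrative values
N = (Nx + 1)*(Ny + 1)
m = lambda i, j: j*(Nx + 1) + i
off1, off2 = 1, Nx + 1                   # lower_offset and lower2_offset

main = np.zeros(N)
lower = np.zeros(N - off1);   upper = np.zeros(N - off1)
lower2 = np.zeros(N - off2);  upper2 = np.zeros(N - off2)

main[m(0, 0):m(Nx+1, 0)] = 1             # j=0 boundary line
for j in range(1, Ny):
    main[m(0, j)] = main[m(Nx, j)] = 1   # boundary points i=0 and i=Nx
    lower2[m(1, j)-off2:m(Nx, j)-off2] = -theta*Fy
    lower[m(1, j)-off1:m(Nx, j)-off1] = -theta*Fx
    main[m(1, j):m(Nx, j)] = 1 + 2*theta*(Fx + Fy)
    upper[m(1, j):m(Nx, j)] = -theta*Fx
    upper2[m(1, j):m(Nx, j)] = -theta*Fy
main[m(0, Ny):m(Nx+1, Ny)] = 1           # j=Ny boundary line

# Dense matrix assembled from the five diagonals
A_diags = (np.diag(main) + np.diag(lower, -off1) + np.diag(upper, off1) +
           np.diag(lower2, -off2) + np.diag(upper2, off2))

# Reference: entry-by-entry filling of a dense matrix
A_ref = np.zeros((N, N))
for j in range(Ny + 1):
    for i in range(Nx + 1):
        p = m(i, j)
        if i in (0, Nx) or j in (0, Ny):
            A_ref[p, p] = 1
        else:
            A_ref[p, m(i, j-1)] = A_ref[p, m(i, j+1)] = -theta*Fy
            A_ref[p, m(i-1, j)] = A_ref[p, m(i+1, j)] = -theta*Fx
            A_ref[p, p] = 1 + 2*theta*(Fx + Fy)

print('max difference:', abs(A_diags - A_ref).max())
```

A nonzero difference would point to a wrong offset in one of the slices.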


Filling the right-hand side; scalar version Setting the entries in the right-hand 
side is easier, since there are no offsets in the array to take into account. The right- 
hand side is in fact similar to the one previously shown, when we used a dense 
matrix representation (the right-hand side vector is, of course, independent of what 
type of representation we use for the coefficient matrix). The complete time loop 
goes as follows. 


import scipy.sparse.linalg

for n in It[0:-1]:
    # Compute b
    j = 0
    for i in Ix:
        p = m(i,j);  b[p] = 0                     # Boundary

    for j in Iy[1:-1]:
        i = 0;  p = m(i,j);  b[p] = 0             # Boundary
        for i in Ix[1:-1]:
            p = m(i,j)                            # Interior
            b[p] = u_n[i,j] + \
              (1-theta)*(
              Fx*(u_n[i+1,j] - 2*u_n[i,j] + u_n[i-1,j]) +\
              Fy*(u_n[i,j+1] - 2*u_n[i,j] + u_n[i,j-1]))\
                + theta*dt*f(i*dx, j*dy, (n+1)*dt) + \
              (1-theta)*dt*f(i*dx, j*dy, n*dt)
        i = Nx;  p = m(i,j);  b[p] = 0            # Boundary

    j = Ny
    for i in Ix:
        p = m(i,j);  b[p] = 0                     # Boundary

    # Solve matrix system A*c = b
    c = scipy.sparse.linalg.spsolve(A, b)

    # Fill u with vector c
    for i in Ix:
        for j in Iy:
            u[i,j] = c[m(i,j)]

    # Update u_n before next step
    u_n, u = u, u_n


Filling the right-hand side; vectorized version. Since we use a sparse matrix 
and try to speed up the computations, we should examine the loops and see if some 
can be easily removed by vectorization. In the filling of A we have already used 
vectorized expressions at each j = const line of mesh points. We can very easily 
do the same in the code above and remove the need for loops over the i index: 


for n in It[0:-1]:
    # Compute b, vectorized version

    # Precompute f in array so we can make slices
    f_a_np1 = f(xv, yv, t[n+1])
    f_a_n   = f(xv, yv, t[n])

    j = 0; b[m(0,j):m(Nx+1,j)] = 0        # Boundary
    for j in Iy[1:-1]:
        i = 0;   p = m(i,j);  b[p] = 0    # Boundary
        i = Nx;  p = m(i,j);  b[p] = 0    # Boundary
        imin = Ix[1]
        imax = Ix[-1]  # for slice, max i index is Ix[-1]-1
        b[m(imin,j):m(imax,j)] = u_n[imin:imax,j] + \
            (1-theta)*(Fx*(
              u_n[imin+1:imax+1,j] -
            2*u_n[imin:imax,j] +
              u_n[imin-1:imax-1,j]) +
                       Fy*(
              u_n[imin:imax,j+1] -
            2*u_n[imin:imax,j] +
              u_n[imin:imax,j-1])) + \
            theta*dt*f_a_np1[imin:imax,j] + \
            (1-theta)*dt*f_a_n[imin:imax,j]
    j = Ny; b[m(0,j):m(Nx+1,j)] = 0       # Boundary

    # Solve matrix system A*c = b
    c = scipy.sparse.linalg.spsolve(A, b)

    # Fill u with vector c
    u[:,:] = c.reshape(Ny+1,Nx+1).T

    # Update u_n before next step
    u_n, u = u, u_n


The most tricky part of this code snippet is the loading of values from the one- 
dimensional array c into the two-dimensional array u. With our numbering of 
unknowns from left to right along “horizontal” mesh lines, the correct reordering 
of the one-dimensional array c as a two-dimensional array requires first a reshap- 
ing to an (Ny+1,Nx+1) two-dimensional array and then taking the transpose. The 
result is an (Nx+1,Ny+1) array compatible with u both in size and appearance of 
the function values. 
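The reordering can be verified on a small mesh by comparing the reshape-and-transpose expression with the explicit double loop. In this sketch the vector `c` is filled with the numbers $0, 1, \ldots, N-1$ (a choice made for illustration) so that the resulting u[i,j] should simply equal $m(i,j)$.

```python
import numpy as np

Nx, Ny = 3, 2
m = lambda i, j: j*(Nx + 1) + i

# c[p] = p, so that u[i,j] is expected to hold the single index m(i,j)
c = np.arange((Nx + 1)*(Ny + 1), dtype=float)

u = c.reshape(Ny+1, Nx+1).T              # vectorized loading of c into u

u_loop = np.zeros((Nx+1, Ny+1))          # scalar loading, for comparison
for i in range(Nx + 1):
    for j in range(Ny + 1):
        u_loop[i, j] = c[m(i, j)]

print(u)
```

Reshaping directly to (Nx+1,Ny+1) without the transpose would give an array of the right shape but with the values in the wrong positions.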

The spsolve function in scipy.sparse.linalg is an efficient version of Gaussian elimination suited for matrices described by diagonals. The algorithm is known as sparse Gaussian elimination, and spsolve calls up a well-tested C code called SuperLU³.

The complete code utilizing spsolve is found in the solver_sparse function 
in the file diffu2D_u0.py. 


Verification We can easily extend the function quadratic from Sect. 3.6.5 to 
include a test of the solver_sparse function as well. 


def quadratic(theta, Nx, Ny):
    ...
    t, cpu = solver_sparse(
        I, a, f, Lx, Ly, Nx, Ny,
        dt, T, theta, user_action=assert_no_error)


3.6.8 The Jacobi Iterative Method 


So far we have created a matrix and right-hand side of a linear system Ac = b and 
solved the system for c by calling an exact algorithm based on Gaussian elimination. 
A much simpler implementation, which requires no memory for the coefficient ma- 
trix A, arises if we solve the system by iterative methods. These methods are only 
approximate, and the core algorithm is repeated many times until the solution is 
considered to be converged. 


Numerical scheme and linear system To illustrate the idea of the Jacobi method, we simplify the numerical scheme to the Backward Euler case, $\theta = 1$, so there are fewer terms to write:

\[ u^{n+1}_{i,j} - \left(F_x(u^{n+1}_{i-1,j} - 2u^{n+1}_{i,j} + u^{n+1}_{i+1,j}) + F_y(u^{n+1}_{i,j-1} - 2u^{n+1}_{i,j} + u^{n+1}_{i,j+1})\right) = u^n_{i,j} + \Delta t f^{n+1}_{i,j}. \tag{3.97} \]

The idea of the Jacobi iterative method is to introduce an iteration, here with index $r$, where we in each iteration treat $u^{n+1}_{i,j}$ as unknown, but use values from the previous iteration for the other unknowns $u^{n+1}_{i\pm 1,j}$ and $u^{n+1}_{i,j\pm 1}$.

³ http://crd-legacy.lbl.gov/~xiaoye/SuperLU/


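Iterations Let $u^{n+1,r}_{i,j}$ be the approximation to $u^{n+1}_{i,j}$ in iteration $r$, for all relevant $i$ and $j$ indices. We first solve with respect to $u^{n+1}_{i,j}$ to get the equation to solve:

\[ u^{n+1}_{i,j} = (1 + 2F_x + 2F_y)^{-1}\left(F_x(u^{n+1}_{i-1,j} + u^{n+1}_{i+1,j}) + F_y(u^{n+1}_{i,j-1} + u^{n+1}_{i,j+1}) + u^n_{i,j} + \Delta t f^{n+1}_{i,j}\right). \tag{3.98} \]

The iteration is introduced by using iteration index $r$, for computed values, on the right-hand side and $r + 1$ (unknown in this iteration) on the left-hand side:

\[ u^{n+1,r+1}_{i,j} = (1 + 2F_x + 2F_y)^{-1}\left(F_x(u^{n+1,r}_{i-1,j} + u^{n+1,r}_{i+1,j}) + F_y(u^{n+1,r}_{i,j-1} + u^{n+1,r}_{i,j+1}) + u^n_{i,j} + \Delta t f^{n+1}_{i,j}\right). \tag{3.99} \]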


Initial guess We start the iteration with the computed values at the previous time level:

\[ u^{n+1,0}_{i,j} = u^n_{i,j},\quad i = 0,\ldots,N_x,\ j = 0,\ldots,N_y. \tag{3.100} \]

Relaxation A common technique in iterative methods is to introduce a relaxation, which means that the new approximation is a weighted mean of the approximation as suggested by the algorithm and the previous approximation. Naming the quantity on the left-hand side of (3.99) as $u^{n+1,*}_{i,j}$, a new approximation based on relaxation reads

\[ u^{n+1,r+1}_{i,j} = \omega u^{n+1,*}_{i,j} + (1-\omega)u^{n+1,r}_{i,j}. \tag{3.101} \]

Under-relaxation means $\omega < 1$, while over-relaxation has $\omega > 1$.


Stopping criteria The iteration can be stopped when the change from one iteration
to the next is sufficiently small ($< \epsilon$), using either an infinity norm,

$$
\max_{i,j}\left| u^{n+1,r+1}_{i,j} - u^{n+1,r}_{i,j}\right| < \epsilon, \tag{3.102}
$$

or an $L^2$ norm,

$$
\left(\Delta x\Delta y\sum_{i,j}\left(u^{n+1,r+1}_{i,j} - u^{n+1,r}_{i,j}\right)^2\right)^{1/2} < \epsilon . \tag{3.103}
$$

Another widely used criterion measures how well the equations are solved by
looking at the residual (essentially $b - Ac^{r+1}$ if $c^{r+1}$ is the approximation to the
solution in iteration $r + 1$). The residual, defined in terms of the finite difference
stencil, is

$$
\begin{aligned}
R_{i,j} = u^{n+1,r+1}_{i,j} &- \left( F_x\left(u^{n+1,r+1}_{i-1,j} - 2u^{n+1,r+1}_{i,j} + u^{n+1,r+1}_{i+1,j}\right) + F_y\left(u^{n+1,r+1}_{i,j-1} - 2u^{n+1,r+1}_{i,j} + u^{n+1,r+1}_{i,j+1}\right)\right) \\
&- u^n_{i,j} - \Delta t\, f^{n+1}_{i,j} .
\end{aligned}
\tag{3.104}
$$

One can then iterate until the norm of the mesh function $R_{i,j}$ is less than some
tolerance:

$$
\left(\Delta x\Delta y\sum_{i,j} R_{i,j}^2\right)^{1/2} < \epsilon . \tag{3.105}
$$
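The three convergence measures can be evaluated with a few array operations. The following is a minimal sketch, assuming NumPy arrays indexed as u[i,j] and a Backward Euler step; the function name convergence_measures and its argument dt_f (the term $\Delta t f^{n+1}_{i,j}$) are illustrative choices, not names from diffu2D_u0.py.

```python
import numpy as np

def convergence_measures(u, u_, u_n, Fx, Fy, dx, dy, dt_f=0.0):
    """Return (max-norm change, L2-norm change, L2-norm residual).

    u, u_ : current and previous iterate at time level n+1
    u_n   : solution at the previous time level
    dt_f  : source contribution dt*f (scalar or array over interior points)
    """
    diff = u - u_
    change_max = np.abs(diff).max()                 # criterion (3.102)
    change_L2 = np.sqrt(dx*dy*np.sum(diff**2))      # criterion (3.103)
    c = u[1:-1, 1:-1]
    R = (c - Fx*(u[:-2, 1:-1] - 2*c + u[2:, 1:-1])
           - Fy*(u[1:-1, :-2] - 2*c + u[1:-1, 2:])
           - u_n[1:-1, 1:-1] - dt_f)                # residual (3.104)
    residual_L2 = np.sqrt(dx*dy*np.sum(R**2))       # criterion (3.105)
    return change_max, change_L2, residual_L2

# A constant field solves the homogeneous problem exactly
u = np.ones((5, 5))
measures = convergence_measures(u, u.copy(), u.copy(), 0.5, 0.5, 0.25, 0.25)
print(measures)
```

The change-based measures are cheap since they only compare two arrays, while the residual requires one more application of the stencil.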


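Code-friendly notation To make the mathematics as close as possible to what we
will write in a computer program, we may introduce some new notation: $u_{i,j}$ is
a short notation for $u^{n+1,r+1}_{i,j}$, $u^{-}_{i,j}$ is a short notation for $u^{n+1,r}_{i,j}$, and $u^{(s)}_{i,j}$ denotes
$u^{n+1-s}_{i,j}$. That is, $u_{i,j}$ is the unknown, $u^{-}_{i,j}$ is its most recently computed approximation,
and $s$ counts time levels backwards in time. The Jacobi method (3.99) takes
the following form with the new notation:

$$
u^{*}_{i,j} = (1 + 2F_x + 2F_y)^{-1}\left( F_x\left(u^{-}_{i-1,j} + u^{-}_{i+1,j}\right) + F_y\left(u^{-}_{i,j-1} + u^{-}_{i,j+1}\right) + u^{(1)}_{i,j} + \Delta t\, f^{n+1}_{i,j}\right) . \tag{3.106}
$$

Generalization of the scheme We can also quite easily introduce the $\theta$ rule for
discretization in time and write up the Jacobi iteration in that case as well:

$$
\begin{aligned}
u^{*}_{i,j} = (1 + 2\theta(F_x + F_y))^{-1}\Big( & \theta\left(F_x\left(u^{-}_{i-1,j} + u^{-}_{i+1,j}\right) + F_y\left(u^{-}_{i,j-1} + u^{-}_{i,j+1}\right)\right) \\
& + u^{(1)}_{i,j} + \theta\Delta t\, f^{n+1}_{i,j} + (1-\theta)\Delta t\, f^{n}_{i,j} \\
& + (1-\theta)\left(F_x\left(u^{(1)}_{i-1,j} - 2u^{(1)}_{i,j} + u^{(1)}_{i+1,j}\right) + F_y\left(u^{(1)}_{i,j-1} - 2u^{(1)}_{i,j} + u^{(1)}_{i,j+1}\right)\right)\Big) .
\end{aligned}
\tag{3.107}
$$

The final update of $u$ applies relaxation:

$$
u_{i,j} = \omega u^{*}_{i,j} + (1-\omega) u^{-}_{i,j} .
$$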


3.6.9 Implementation of the Jacobi Method


The Jacobi method needs no coefficient matrix and right-hand side vector, but it
needs an array for u in the previous iteration. We call this array u_, using the
notation at the end of the previous section (at the same time level). The unknown
itself is called u, while u_n is the computed solution one time level back in time.
With a $\theta$ rule in time, the time loop can be coded like this:


for n in It[0:-1]:
    # Solve linear system by Jacobi iteration at time level n+1
    u_[:,:] = u_n   # Start value
    converged = False
    r = 0
    while not converged:
        if version == 'scalar':
            j = 0
            for i in Ix:
                u[i,j] = U_0y(t[n+1])  # Boundary
            for j in Iy[1:-1]:
                i = 0;   u[i,j] = U_0x(t[n+1])  # Boundary
                i = Nx;  u[i,j] = U_Lx(t[n+1])  # Boundary
                # Interior points
                for i in Ix[1:-1]:
                    u_new = 1.0/(1.0 + 2*theta*(Fx + Fy))*(theta*(
                        Fx*(u_[i+1,j] + u_[i-1,j]) +
                        Fy*(u_[i,j+1] + u_[i,j-1])) +
                        u_n[i,j] +
                        (1-theta)*(
                        Fx*(u_n[i+1,j] - 2*u_n[i,j] + u_n[i-1,j]) +
                        Fy*(u_n[i,j+1] - 2*u_n[i,j] + u_n[i,j-1])) +
                        theta*dt*f(i*dx,j*dy,(n+1)*dt) +
                        (1-theta)*dt*f(i*dx,j*dy,n*dt))
                    u[i,j] = omega*u_new + (1-omega)*u_[i,j]
            j = Ny
            for i in Ix:
                u[i,j] = U_Ly(t[n+1])  # Boundary
        elif version == 'vectorized':
            j = 0;  u[:,j] = U_0y(t[n+1])  # Boundary
            i = 0;  u[i,:] = U_0x(t[n+1])  # Boundary
            i = Nx; u[i,:] = U_Lx(t[n+1])  # Boundary
            j = Ny; u[:,j] = U_Ly(t[n+1])  # Boundary
            # Internal points
            f_a_np1 = f(xv, yv, t[n+1])
            f_a_n   = f(xv, yv, t[n])
            u_new = 1.0/(1.0 + 2*theta*(Fx + Fy))*(theta*(
                Fx*(u_[2:,1:-1] + u_[:-2,1:-1]) +
                Fy*(u_[1:-1,2:] + u_[1:-1,:-2])) +
                u_n[1:-1,1:-1] +
                (1-theta)*(
                Fx*(u_n[2:,1:-1] - 2*u_n[1:-1,1:-1] + u_n[:-2,1:-1]) +
                Fy*(u_n[1:-1,2:] - 2*u_n[1:-1,1:-1] + u_n[1:-1,:-2])) +
                theta*dt*f_a_np1[1:-1,1:-1] +
                (1-theta)*dt*f_a_n[1:-1,1:-1])
            u[1:-1,1:-1] = omega*u_new + (1-omega)*u_[1:-1,1:-1]
        r += 1
        converged = np.abs(u-u_).max() < tol or r >= max_iter
        u_[:,:] = u

    # Update u_n before next step
    u_n, u = u, u_n


The vectorized version should be quite straightforward to understand once one has
an understanding of how a standard 2D finite difference stencil is vectorized.

The first natural verification is to use the test problem in the function quadratic
from Sect. 3.6.5. This problem is known to have no approximation error, but any
iterative method will produce an approximate solution with unknown error. For a
tolerance $10^{-k}$ in the iterative method, we can, e.g., use a slightly larger tolerance
$10^{-(k-1)}$ for the difference between the exact and the computed solution.


def quadratic(theta, Nx, Ny):
    ...  # Set-up of the test problem as in Sect. 3.6.5

    def assert_small_error(u, x, xv, y, yv, t, n):
        """Assert small error for iterative methods."""
        u_e = u_exact(xv, yv, t[n])
        diff = abs(u - u_e).max()
        tol = 1E-4
        msg = 'diff=%g, step %d, time=%g' % (diff, n, t[n])
        assert diff < tol, msg

    for version in 'scalar', 'vectorized':
        for theta in 1, 0.5:
            print('testing Jacobi, %s version, theta=%g' %
                  (version, theta))
            t, cpu = solver_Jacobi(
                I=I, a=a, f=f, Lx=Lx, Ly=Ly, Nx=Nx, Ny=Ny,
                dt=dt, T=T, theta=theta,
                U_0x=0, U_0y=0, U_Lx=0, U_Ly=0,
                user_action=assert_small_error,
                version=version, iteration='Jacobi',
                omega=1.0, max_iter=100, tol=1E-5)


Even for a very coarse $4\times 4$ mesh, the Jacobi method requires 26 iterations to reach
a tolerance of $10^{-5}$, which is quite many iterations, given that there are only 25
unknowns.
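The time loop above depends on variables defined elsewhere in the solver. As a minimal self-contained sketch of the same idea, the function below runs relaxed Jacobi iterations for a single Backward Euler step ($\theta = 1$, $f = 0$, fixed boundary values) and then checks the residual (3.104). The function name jacobi_backward_euler_step and the test data are assumptions made for this illustration only.

```python
import numpy as np

def jacobi_backward_euler_step(u_n, Fx, Fy, omega=1.0, tol=1e-10,
                               max_iter=10000):
    """Relaxed Jacobi iterations (3.99), (3.101) for one Backward Euler step.

    Assumes f = 0 and that the boundary values stored in u_n are kept.
    Returns the new solution and the number of iterations used.
    """
    u_ = u_n.copy()        # previous iterate; the start value is u_n
    u = u_n.copy()
    for r in range(1, max_iter + 1):
        u_star = (Fx*(u_[2:, 1:-1] + u_[:-2, 1:-1]) +
                  Fy*(u_[1:-1, 2:] + u_[1:-1, :-2]) +
                  u_n[1:-1, 1:-1])/(1.0 + 2*Fx + 2*Fy)
        u[1:-1, 1:-1] = omega*u_star + (1 - omega)*u_[1:-1, 1:-1]
        if np.abs(u - u_).max() < tol:
            return u, r
        u_[:, :] = u
    return u, max_iter

Nx = Ny = 4
x = np.linspace(0, 1, Nx + 1)
y = np.linspace(0, 1, Ny + 1)
u_n = np.outer(np.sin(np.pi*x), np.sin(np.pi*y))
u_n[0, :] = u_n[-1, :] = u_n[:, 0] = u_n[:, -1] = 0.0  # exact zeros on boundary
Fx = Fy = 0.5
u, iterations = jacobi_backward_euler_step(u_n, Fx, Fy)

# Residual (3.104) at the interior points
c = u[1:-1, 1:-1]
R = (c - Fx*(u[:-2, 1:-1] - 2*c + u[2:, 1:-1])
       - Fy*(u[1:-1, :-2] - 2*c + u[1:-1, 2:]) - u_n[1:-1, 1:-1])
print(iterations, np.abs(R).max())
```

Checking the residual avoids building the coefficient matrix just for the purpose of verification.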


3.6.10 Test Problem: Diffusion of a Sine Hill 


It can be shown that

$$
u_e = A e^{-\alpha\pi^2(L_x^{-2} + L_y^{-2})t}\sin\left(\frac{\pi}{L_x}x\right)\sin\left(\frac{\pi}{L_y}y\right), \tag{3.108}
$$

is a solution of the 2D homogeneous diffusion equation $u_t = \alpha(u_{xx} + u_{yy})$ in a
rectangle $[0, L_x]\times[0, L_y]$, for any value of the amplitude $A$. This solution vanishes
at the boundaries, and the initial condition is the product of two sines. We may
choose $A = 1$ for simplicity.

It is difficult to know if our solver based on the Jacobi method works properly
since we are faced with two sources of errors: one from the discretization, $E_\Delta$, and
one from the iterative Jacobi method, $E_i$. The total error in the computed $u$ can be
represented as

$$
E_u = E_\Delta + E_i .
$$

One error measure is to look at the maximum value, which is obtained for the midpoint
$x = L_x/2$ and $y = L_y/2$. This midpoint is represented in the discrete $u$ if
$N_x$ and $N_y$ are even numbers. We can then compute $E_u$ as $E_u = |\max u_e - \max u|$,
when we know an exact solution $u_e$ of the problem.

What about $E_\Delta$? If we use the maximum value as a measure of the error, we have
in fact analytical insight into the approximation error in this particular problem.
According to Sect. 3.3.9, the exact solution (3.108) of the PDE problem is also an
exact solution of the discrete equations, except that the damping factor in time is
different. More precisely, (3.66) and (3.67) are solutions of the discrete problem for
$\theta = 1$ (Backward Euler) and $\theta = \frac{1}{2}$ (Crank-Nicolson), respectively. The factors
raised to the power $n$ are the numerical amplitudes, and the errors in these factors
become

$$
E_\Delta = e^{-\alpha k^2 t} - \left(\frac{1 - 2(F_x\sin^2 p_x + F_y\sin^2 p_y)}{1 + 2(F_x\sin^2 p_x + F_y\sin^2 p_y)}\right)^n, \quad \theta = \frac{1}{2},
$$

$$
E_\Delta = e^{-\alpha k^2 t} - (1 + 4F_x\sin^2 p_x + 4F_y\sin^2 p_y)^{-n}, \quad \theta = 1 .
$$
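The two expressions for $E_\Delta$ are easy to evaluate numerically. The sketch below does so for the sine hill (3.108) with $A = 1$. It assumes the definitions $F_x = \alpha\Delta t/\Delta x^2$, $F_y = \alpha\Delta t/\Delta y^2$, $p_x = k_x\Delta x/2$, $p_y = k_y\Delta y/2$, with $k_x = \pi/L_x$, $k_y = \pi/L_y$ and $k^2 = k_x^2 + k_y^2$, which are not restated in this section; the function name discretization_error is hypothetical.

```python
import numpy as np

def discretization_error(theta, n, dt, alpha, Lx, Ly, Nx, Ny):
    """Amplitude error E_Delta after n time steps for the sine hill, A = 1."""
    dx, dy = Lx/Nx, Ly/Ny
    kx, ky = np.pi/Lx, np.pi/Ly            # wave numbers in (3.108)
    Fx, Fy = alpha*dt/dx**2, alpha*dt/dy**2
    s = Fx*np.sin(kx*dx/2)**2 + Fy*np.sin(ky*dy/2)**2
    exact = np.exp(-alpha*(kx**2 + ky**2)*n*dt)
    if theta == 1:                         # Backward Euler
        numerical = (1 + 4*s)**(-n)
    elif theta == 0.5:                     # Crank-Nicolson
        numerical = ((1 - 2*s)/(1 + 2*s))**n
    else:
        raise ValueError('theta must be 1 or 0.5')
    return exact - numerical

E_BE = discretization_error(1, 10, 0.01, 1.0, 1.0, 1.0, 10, 10)
E_CN = discretization_error(0.5, 10, 0.01, 1.0, 1.0, 1.0, 10, 10)
print('E_Delta: Backward Euler %.2e, Crank-Nicolson %.2e' % (E_BE, E_CN))
```

For these parameters the Crank-Nicolson amplitude error is markedly smaller than the Backward Euler error, as one would expect from the order of the two schemes.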


We are now in a position to compute $E_i$ numerically. That is, we can compute the
error due to iterative solution of the linear system and see if it corresponds to the
convergence tolerance used in the method. Note that the convergence is based on
measuring the difference in two consecutive approximations, which is not exactly
the error due to the iteration, but it is a kind of measure, and it should have about
the same size as $E_i$.

The function demo_classic_iterative in diffu2D_u0.py implements the
idea above (also for the methods in Sect. 3.6.12). The value of $E_i$ is in particular
printed at each time level. By changing the tolerance in the convergence criterion
of the Jacobi method, we can see that $E_i$ is of the same order of magnitude as the
prescribed tolerance in the Jacobi method. For example: $E_\Delta \sim 10^{-2}$ with $N_x =
N_y = 10$ and $\theta = \frac{1}{2}$, as long as $\max u$ has some significant size ($\max u > 0.02$).
An appropriate value of the tolerance is then $10^{-3}$, such that the error in the Jacobi
method does not become bigger than the discretization error. In that case, $E_i$ is
around $5\cdot 10^{-3}$. The corresponding number of Jacobi iterations (with $\omega = 1$)
varies from 31 to 12 during the time simulation (for $\max u > 0.02$). Changing
the tolerance to $10^{-5}$ causes many more iterations (61 to 42) without giving any
contribution to the overall accuracy, because the total error is dominated by $E_\Delta$.

Also, with $N_x = N_y = 20$, the spatial accuracy increases and many more
iterations are needed (143 to 45), but the dominating error is from the time discretization.
However, with such a finer spatial mesh, a stricter tolerance in the
convergence criterion, $10^{-4}$, is needed to keep $E_i \sim 10^{-3}$. More experiments show
the disadvantage of the very simple Jacobi iteration method: the number of iterations
increases with the number of unknowns, keeping the tolerance fixed, but the
tolerance should also be lowered to prevent the iteration error from dominating the total error.
A small adjustment of the Jacobi method, as described in Sect. 3.6.12, provides
a better method.


3.6.11 The Relaxed Jacobi Method and Its Relation to the Forward 
Euler Method 


We shall now show that solving the Poisson equation $-\alpha\nabla^2 u = f$ by the Jacobi
iterative method is in fact equivalent to using a Forward Euler scheme on $u_t =
\alpha\nabla^2 u + f$ and letting $t\rightarrow\infty$.

A Forward Euler discretization of the 2D diffusion equation,

$$
[D_t^+ u = \alpha(D_xD_x u + D_yD_y u) + f]^n_{i,j},
$$

can be written out as

$$
u^{n+1}_{i,j} = u^n_{i,j} + \frac{\alpha\Delta t}{h^2}\left(u^n_{i-1,j} + u^n_{i+1,j} + u^n_{i,j-1} + u^n_{i,j+1} - 4u^n_{i,j} + \frac{h^2}{\alpha} f_{i,j}\right),
$$

where $h = \Delta x = \Delta y$ has been introduced for simplicity. The scheme can be
reordered as

$$
u^{n+1}_{i,j} = (1-\omega)u^n_{i,j} + \frac{1}{4}\omega\left(u^n_{i-1,j} + u^n_{i+1,j} + u^n_{i,j-1} + u^n_{i,j+1} + \frac{h^2}{\alpha} f_{i,j}\right),
$$

with

$$
\omega = 4\frac{\alpha\Delta t}{h^2},
$$

but this latter form is nothing but the relaxed Jacobi method applied to

$$
[\alpha(D_xD_x u + D_yD_y u) = -f]_{i,j} .
$$

From the equivalence above we know a couple of things about the Jacobi method
for solving $-\alpha\nabla^2 u = f$:

1. The method is unstable if $\omega > 1$ (since the Forward Euler method is then unstable).

2. The convergence is really slow as the iteration index increases (coming from the
fact that the Forward Euler scheme requires many small time steps to reach the
stationary solution).
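The equivalence can be checked numerically with a few lines of NumPy. The sketch below assumes $h = \Delta x = \Delta y$ and $\omega = 4\alpha\Delta t/h^2$, applies one Forward Euler step and one relaxed Jacobi sweep to the same (random) mesh function, and compares the results. All variable names are chosen for this illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 8
h = 1.0/N
alpha = 2.0
dt = 0.2*h**2/alpha              # gives alpha*dt/h**2 = 0.2 <= 1/4
omega = 4*alpha*dt/h**2          # relaxation parameter, here 0.8
u = rng.random((N + 1, N + 1))   # arbitrary time level / iterate
f = rng.random((N + 1, N + 1))   # arbitrary source term

c = u[1:-1, 1:-1]
neighbors = u[:-2, 1:-1] + u[2:, 1:-1] + u[1:-1, :-2] + u[1:-1, 2:]

# One Forward Euler step for u_t = alpha*laplace(u) + f
u_FE = c + alpha*dt/h**2*(neighbors - 4*c) + dt*f[1:-1, 1:-1]

# One relaxed Jacobi sweep for -alpha*laplace(u) = f
u_star = 0.25*(neighbors + h**2*f[1:-1, 1:-1]/alpha)
u_J = omega*u_star + (1 - omega)*c

print('max difference:', np.abs(u_FE - u_J).max())
```

The two updates agree to rounding error, so the Forward Euler stability limit $\alpha\Delta t/h^2 \leq \frac{1}{4}$ translates directly into $\omega \leq 1$.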


These observations are quite disappointing: if we already have a time-dependent
diffusion problem and want to take larger time steps by an implicit time discretization
method, we will with the Jacobi method end up with something close to a slow
Forward Euler simulation of the original problem at each time level. Nevertheless,
there are two reasons why the Jacobi method remains a fundamental building
block for solving linear systems arising from PDEs: 1) a couple of iterations remove
large parts of the error and this is effectively used in the very efficient class
of multigrid methods; and 2) the idea of the Jacobi method can be developed into
more efficient methods, especially the SOR method, which is treated next.


3.6.12 The Gauss-Seidel and SOR Methods


If we update the mesh points according to the Jacobi method (3.98) for a Backward
Euler discretization with a loop over $i = 1,\ldots,N_x - 1$ and $j = 1,\ldots,N_y - 1$,
we realize that when $u^{n+1,r+1}_{i,j}$ is computed, $u^{n+1,r+1}_{i-1,j}$ and $u^{n+1,r+1}_{i,j-1}$ are already computed,
so these new values can be used rather than $u^{n+1,r}_{i-1,j}$ and $u^{n+1,r}_{i,j-1}$ (respectively)
in the formula for $u^{n+1,r+1}_{i,j}$. This idea gives rise to the Gauss-Seidel iteration
method, which mathematically is just a small adjustment of (3.98):

$$
u^{n+1,r+1}_{i,j} = (1 + 2F_x + 2F_y)^{-1}\left( F_x\left(u^{n+1,r+1}_{i-1,j} + u^{n+1,r}_{i+1,j}\right) + F_y\left(u^{n+1,r+1}_{i,j-1} + u^{n+1,r}_{i,j+1}\right) + u^n_{i,j} + \Delta t\, f^{n+1}_{i,j}\right) . \tag{3.109}
$$


Observe that the way we access the mesh points in the formula (3.109) is important:
points with $i - 1$ must be computed before points with $i$, and points with $j - 1$ must
be computed before points with $j$. Any sequence of mesh points can be used in
the Gauss-Seidel method, but the particular math formula must distinguish between
already visited points in the current iteration and the points not yet visited.

The idea of relaxation (3.101) can equally well be applied to the Gauss-Seidel
method. Actually, the Gauss-Seidel method with an arbitrary $0 < \omega < 2$ has its
own name: the Successive Over-Relaxation method, abbreviated as SOR.

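The SOR method for a $\theta$ rule discretization, with the shortened $u$ and $u^{-}$ notation,
can be written

$$
\begin{aligned}
u^{*}_{i,j} = (1 + 2\theta(F_x + F_y))^{-1}\Big( & \theta\left(F_x\left(u_{i-1,j} + u^{-}_{i+1,j}\right) + F_y\left(u_{i,j-1} + u^{-}_{i,j+1}\right)\right) \\
& + u^{(1)}_{i,j} + \theta\Delta t\, f^{n+1}_{i,j} + (1-\theta)\Delta t\, f^{n}_{i,j} \\
& + (1-\theta)\left(F_x\left(u^{(1)}_{i-1,j} - 2u^{(1)}_{i,j} + u^{(1)}_{i+1,j}\right) + F_y\left(u^{(1)}_{i,j-1} - 2u^{(1)}_{i,j} + u^{(1)}_{i,j+1}\right)\right)\Big),
\end{aligned}
\tag{3.110}
$$

$$
u_{i,j} = \omega u^{*}_{i,j} + (1-\omega) u^{-}_{i,j} . \tag{3.111}
$$

The sequence of mesh points in (3.110) is $i = 1,\ldots,N_x - 1$, $j = 1,\ldots,N_y - 1$
(but whether $i$ runs faster or slower than $j$ does not matter).


3.6.13 Scalar Implementation of the SOR Method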


Since the Jacobi and Gauss-Seidel methods with relaxation are so similar, we can 
easily make a common code for the two: 
for n in It[0:-1]:
    # Solve linear system by Jacobi/SOR iteration at time level n+1
    u_[:,:] = u_n   # Start value
    converged = False
    r = 0
    while not converged:
        if version == 'scalar':
            if iteration == 'Jacobi':
                u__ = u_
            elif iteration == 'SOR':
                u__ = u
            j = 0
            for i in Ix:
                u[i,j] = U_0y(t[n+1])  # Boundary
            for j in Iy[1:-1]:
                i = 0;   u[i,j] = U_0x(t[n+1])  # Boundary
                i = Nx;  u[i,j] = U_Lx(t[n+1])  # Boundary
                for i in Ix[1:-1]:
                    u_new = 1.0/(1.0 + 2*theta*(Fx + Fy))*(theta*(
                        Fx*(u_[i+1,j] + u__[i-1,j]) +
                        Fy*(u_[i,j+1] + u__[i,j-1])) +
                        u_n[i,j] + (1-theta)*(
                        Fx*(u_n[i+1,j] - 2*u_n[i,j] + u_n[i-1,j]) +
                        Fy*(u_n[i,j+1] - 2*u_n[i,j] + u_n[i,j-1])) +
                        theta*dt*f(i*dx,j*dy,(n+1)*dt) +
                        (1-theta)*dt*f(i*dx,j*dy,n*dt))
                    u[i,j] = omega*u_new + (1-omega)*u_[i,j]
            j = Ny
            for i in Ix:
                u[i,j] = U_Ly(t[n+1])  # Boundary
        r += 1
        converged = np.abs(u-u_).max() < tol or r >= max_iter
        u_[:,:] = u

    u_n, u = u, u_n  # Get ready for next time level


The idea here is to introduce u__ to be used for already computed values (u) in the 
Gauss-Seidel/SOR version of the implementation, or just values from the previous 
iteration (u_) in case of the Jacobi method. 
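To see the effect of reusing already computed values, the sketch below applies the same u__ trick in a self-contained function and counts the iterations needed by the Jacobi and Gauss-Seidel methods for one Backward Euler step. It assumes $f = 0$ and fixed boundary values; the function name iterate and the random test data are illustrative assumptions.

```python
import numpy as np

def iterate(u_n, Fx, Fy, iteration='Jacobi', omega=1.0, tol=1e-5,
            max_iter=1000):
    """Scalar Jacobi or Gauss-Seidel/SOR iterations, one Backward Euler step."""
    Nx, Ny = u_n.shape[0] - 1, u_n.shape[1] - 1
    u_ = u_n.copy()
    u = u_n.copy()
    for r in range(1, max_iter + 1):
        # u__ supplies the values at the already visited points (i-1, j-1)
        u__ = u_ if iteration == 'Jacobi' else u
        for j in range(1, Ny):
            for i in range(1, Nx):
                u_new = (Fx*(u_[i+1, j] + u__[i-1, j]) +
                         Fy*(u_[i, j+1] + u__[i, j-1]) +
                         u_n[i, j])/(1.0 + 2*Fx + 2*Fy)
                u[i, j] = omega*u_new + (1 - omega)*u_[i, j]
        if np.abs(u - u_).max() < tol:
            return u, r
        u_[:, :] = u
    return u, max_iter

rng = np.random.default_rng(2)
u_n = np.zeros((11, 11))
u_n[1:-1, 1:-1] = rng.random((9, 9))    # zero boundary values
u_J, it_J = iterate(u_n, 1.0, 1.0, 'Jacobi')
u_GS, it_GS = iterate(u_n, 1.0, 1.0, 'SOR', omega=1.0)
print('Jacobi: %d iterations, Gauss-Seidel: %d iterations' % (it_J, it_GS))
```

With $\omega = 1$ the 'SOR' branch is the plain Gauss-Seidel method, and other values of $\omega$ can be tried by changing the omega argument.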


3.6.14 Vectorized Implementation of the SOR Method 


Vectorizing the Gauss-Seidel iteration step turns out to be non-trivial. The prob- 
lem is that vectorized operations typically imply operations on arrays where the 
sequence in which we visit the elements does not matter. In particular, this prin- 
ciple makes vectorized code trivial to parallelize. However, in the Gauss-Seidel 
algorithm, the sequence in which we visit the elements in the arrays does matter, 
and it is well known that the basic method as explained above cannot be parallelized. 
Therefore, also vectorization will require new thinking. 

The strategy for vectorizing (and parallelizing) the Gauss-Seidel method is to
use a special numbering of the mesh points called red-black numbering: every other
point is red or black as in a checkerboard pattern. This numbering requires $N_x$ and
$N_y$ to be even numbers. Here is an example of a $6\times 6$ mesh:

    r b r b r b r
    b r b r b r b
    r b r b r b r
    b r b r b r b
    r b r b r b r
    b r b r b r b
    r b r b r b r

The idea now is to first update all the red points. Each formula for updating a red 
point involves only the black neighbors. Thereafter, we update all the black points, 
and at each black point, only the recently computed red points are involved. 

The scalar implementation of the red-black numbered Gauss-Seidel method is 
really compact, since we can update values directly in u (this guarantees that we 
use the most recently computed values). Here is the relevant code for the Backward 
Euler scheme in time and without a source term: 


# Update internal points
for sweep in 'red', 'black':
    for j in range(1, Ny, 1):
        if sweep == 'red':
            start = 1 if j % 2 == 1 else 2
        elif sweep == 'black':
            start = 2 if j % 2 == 1 else 1
        for i in range(start, Nx, 2):
            u[i,j] = 1.0/(1.0 + 2*(Fx + Fy))*(
                Fx*(u[i+1,j] + u[i-1,j]) +
                Fy*(u[i,j+1] + u[i,j-1]) + u_n[i,j])


The vectorized version must be based on slices. Looking at a typical red-black
pattern, e.g.,

    r b r b r b r
    b r b r b r b
    r b r b r b r
    b r b r b r b
    r b r b r b r
    b r b r b r b
    r b r b r b r

we want to update the internal points (marking boundary points with x):

    x x x x x x x
    x r b r b r x
    x b r b r b x
    x r b r b r x
    x b r b r b x
    x r b r b r x
    x x x x x x x



It is impossible to make one slice that picks out all the internal red points. Instead,
we need two slices. The first involves points marked with R:

    x x x x x x x
    x R b R b R x
    x b r b r b x
    x R b R b R x
    x b r b r b x
    x R b R b R x
    x x x x x x x

This slice is specified as 1::2 for i and 1::2 for j, or with slice objects:

i = slice(1, None, 2); j = slice(1, None, 2)


The second slice involves the red points with R:

    x x x x x x x
    x r b r b r x
    x b R b R b x
    x r b r b r x
    x b R b R b x
    x r b r b r x
    x x x x x x x

The slices must stop before the last index so that the boundary points are left out:

i = slice(2, -1, 2); j = slice(2, -1, 2)


For the black points, the first slice involves the B points:

    x x x x x x x
    x r B r B r x
    x b r b r b x
    x r B r B r x
    x b r b r b x
    x r B r B r x
    x x x x x x x

with slice objects

i = slice(2, -1, 2); j = slice(1, None, 2)

The second set of black points is shown here:

    x x x x x x x
    x r b r b r x
    x B r B r B x
    x r b r b r x
    x B r B r B x
    x r b r b r x
    x x x x x x x

with slice objects

i = slice(1, None, 2); j = slice(2, -1, 2)


That is, we need four sets of slices. The simplest way of implementing the
algorithm is to make a function with variables for the slices representing $i$, $i - 1$,
$i + 1$, $j$, $j - 1$, and $j + 1$, here called ic ("i center"), im1 ("i minus 1"), ip1 ("i
plus 1"), jc, jm1, and jp1, respectively.


def update(u_, u_n, ic, im1, ip1, jc, jm1, jp1):
    return \
        1.0/(1.0 + 2*theta*(Fx + Fy))*(theta*(
            Fx*(u_[ip1,jc] + u_[im1,jc]) +
            Fy*(u_[ic,jp1] + u_[ic,jm1])) +
        u_n[ic,jc] + (1-theta)*(
            Fx*(u_n[ip1,jc] - 2*u_n[ic,jc] + u_n[im1,jc]) +
            Fy*(u_n[ic,jp1] - 2*u_n[ic,jc] + u_n[ic,jm1])) +
        theta*dt*f_a_np1[ic,jc] +
        (1-theta)*dt*f_a_n[ic,jc])


The formula returned from update is to be compared with (3.110). 
The relaxed Jacobi iteration can be implemented by 


ic  = jc  = slice(1,-1)
im1 = jm1 = slice(0,-2)
ip1 = jp1 = slice(2,None)
u_new[ic,jc] = update(
    u_, u_n, ic, im1, ip1, jc, jm1, jp1)
u[ic,jc] = omega*u_new[ic,jc] + (1-omega)*u_[ic,jc]


The Gauss-Seidel (or SOR) updates need four different steps. The ic and jc
slices are specified above. For each of these, we must specify the corresponding
im1, ip1, jm1, and jp1 slices. The code below contains the details.

# Red points
ic  = slice(1,-1,2)
im1 = slice(0,-2,2)
ip1 = slice(2,None,2)
jc  = slice(1,-1,2)
jm1 = slice(0,-2,2)
jp1 = slice(2,None,2)
u_new[ic,jc] = update(
    u_new, u_n, ic, im1, ip1, jc, jm1, jp1)

ic  = slice(2,-1,2)
im1 = slice(1,-2,2)
ip1 = slice(3,None,2)
jc  = slice(2,-1,2)
jm1 = slice(1,-2,2)
jp1 = slice(3,None,2)
u_new[ic,jc] = update(
    u_new, u_n, ic, im1, ip1, jc, jm1, jp1)

# Black points
ic  = slice(2,-1,2)
im1 = slice(1,-2,2)
ip1 = slice(3,None,2)
jc  = slice(1,-1,2)
jm1 = slice(0,-2,2)
jp1 = slice(2,None,2)
u_new[ic,jc] = update(
    u_new, u_n, ic, im1, ip1, jc, jm1, jp1)

ic  = slice(1,-1,2)
im1 = slice(0,-2,2)
ip1 = slice(2,None,2)
jc  = slice(2,-1,2)
jm1 = slice(1,-2,2)
jp1 = slice(3,None,2)
u_new[ic,jc] = update(
    u_new, u_n, ic, im1, ip1, jc, jm1, jp1)

# Relax
c = slice(1,-1)
u[c,c] = omega*u_new[c,c] + (1-omega)*u_[c,c]


The function solver_classic_iterative in diffu2D_u0.py contains a unified
implementation of the relaxed Jacobi and SOR methods in scalar and vectorized
versions using the techniques explained above.
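A useful check of the slice bookkeeping is to compare a vectorized red-black sweep with the scalar one. The sketch below does this for the Backward Euler scheme without a source term, updating u in place. It assumes even $N_x$ and $N_y$; the helper names red_black_scalar, red_black_vectorized and slices are made up for this illustration and do not appear in diffu2D_u0.py.

```python
import numpy as np

def red_black_scalar(u, u_n, Fx, Fy):
    """One red-black Gauss-Seidel sweep with scalar loops, in place."""
    Nx, Ny = u.shape[0] - 1, u.shape[1] - 1
    for sweep in 'red', 'black':
        for j in range(1, Ny):
            if sweep == 'red':
                start = 1 if j % 2 == 1 else 2
            else:
                start = 2 if j % 2 == 1 else 1
            for i in range(start, Nx, 2):
                u[i, j] = (Fx*(u[i+1, j] + u[i-1, j]) +
                           Fy*(u[i, j+1] + u[i, j-1]) +
                           u_n[i, j])/(1.0 + 2*(Fx + Fy))
    return u

def slices(start):
    """Center, minus-one and plus-one slices for points start, start+2, ..."""
    return (slice(start, -1, 2), slice(start - 1, -2, 2),
            slice(start + 1, None, 2))

def red_black_vectorized(u, u_n, Fx, Fy):
    """Same sweep with four slice updates: two red sets, then two black sets."""
    for i_start, j_start in (1, 1), (2, 2), (2, 1), (1, 2):
        ic, im1, ip1 = slices(i_start)
        jc, jm1, jp1 = slices(j_start)
        u[ic, jc] = (Fx*(u[ip1, jc] + u[im1, jc]) +
                     Fy*(u[ic, jp1] + u[ic, jm1]) +
                     u_n[ic, jc])/(1.0 + 2*(Fx + Fy))
    return u

rng = np.random.default_rng(3)
u_n = rng.random((7, 9))                 # Nx = 6, Ny = 8
u_s = red_black_scalar(u_n.copy(), u_n, 0.7, 1.3)
u_v = red_black_vectorized(u_n.copy(), u_n, 0.7, 1.3)
print('max difference:', np.abs(u_s - u_v).max())
```

The in-place vectorized update is safe because every point of one color only reads points of the other color, so the order within a color does not matter.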


3.6.15 Direct Versus Iterative Methods 


Direct methods There are two classes of methods for solving linear systems: direct
methods and iterative methods. Direct methods are based on variants of the
Gaussian elimination procedure and will produce an exact solution (in exact arithmetic)
in an a priori known number of steps. Iterative methods, on the other hand,
produce an approximate solution, and the amount of work for reaching a given accuracy
is usually not known in advance.

The most common direct method today is to use the LU factorization procedure
to factor the coefficient matrix $A$ as the product of a lower-triangular matrix $L$ (with
unit diagonal terms) and an upper-triangular matrix $U$: $A = LU$. As soon as we
have $L$ and $U$, a system of equations $LUc = b$ is easy to solve because of the
triangular nature of $L$ and $U$. We first solve $Ly = b$ for $y$ (forward substitution),
and thereafter we find $c$ from solving $Uc = y$ (backward substitution). When $A$ is
a dense $N\times N$ matrix, the LU factorization costs about $\frac{1}{3}N^3$ arithmetic operations
(counting a multiplication-addition pair as one operation), while
the forward and backward substitution steps each require of the order $N^2$ arithmetic
operations. That is, factorization dominates the costs, while the substitution steps
are cheap.
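The split between an expensive factorization and cheap substitutions can be illustrated with scipy.linalg. The sketch below factors a dense matrix once and reuses the factors for several right-hand sides. Note that lu_factor applies row pivoting, so it computes $PA = LU$ with a permutation matrix $P$ rather than the plain $A = LU$; the test matrix is an arbitrary assumption.

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

rng = np.random.default_rng(1)
N = 50
A = rng.random((N, N)) + N*np.eye(N)   # diagonally dominant, nonsingular
lu_piv = lu_factor(A)                  # factorization: O(N^3) work, done once
for k in range(3):
    b = rng.random(N)
    c = lu_solve(lu_piv, b)            # substitutions: O(N^2) work per solve
    print(k, np.abs(A @ c - b).max())
```

Reusing the factorization is exactly what pays off in a time loop where the coefficient matrix is constant and only the right-hand side changes.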

Symmetric, positive definite coefficient matrices often arise when discretizing
PDEs. In this case, the LU factorization becomes $A = LL^T$, and the associated
algorithm is known as Cholesky factorization. Most linear algebra software offers
highly optimized implementations of LU and Cholesky factorization as well as forward
and backward substitution (scipy.linalg is the relevant Python package).

Finite difference discretizations lead to sparse coefficient matrices. An extreme
case arose in Sect. 3.2.1 where $A$ was tridiagonal. For a tridiagonal matrix, the
amount of arithmetic operations in the LU and Cholesky factorization algorithms
is just of the order $N$, not $N^3$. Tridiagonal matrices are special cases of banded
matrices, where the matrices contain just a set of diagonal bands. Finite difference
methods on regularly numbered rectangular and box-shaped meshes give rise to
such banded matrices, with 5 bands in 2D and 7 in 3D for diffusion problems.
Gaussian elimination only needs to work within the bands, leading to much more
efficient algorithms.

If $A_{i,j} = 0$ for $j > i + p$ and $j < i - p$, $p$ is the half-bandwidth of the
matrix. We have in our 2D problem $p = N_x + 2$, while in 3D, $p = (N_x + 1)(N_y +
1) + 2$. The cost of Gaussian elimination is then $\mathcal{O}(Np^2)$, so with $p \ll N$, we
see that banded matrices are much more efficient to compute with. By reordering
the unknowns in clever ways, one can reduce the work of Gaussian elimination
further. Fortunately, the Python programmer has access to such algorithms through
the scipy.sparse.linalg package.

Although a direct method is an exact algorithm, rounding errors may in practice 
accumulate and pollute the solution. The effect grows with the size of the linear 
system, so both for accuracy and efficiency, iterative methods are better suited than 
direct methods for solving really large linear systems. 


Iterative methods The Jacobi and SOR iterative methods belong to a class of iterative
methods where the idea is to solve $Au = b$ by splitting $A$ into two parts,
$A = M - N$, such that solving systems $Mu = c$ is easy and efficient. With the
splitting, we get a system

$$
Mu = Nu + b,
$$

which suggests an iterative method

$$
Mu^{r+1} = Nu^{r} + b, \quad r = 0, 1, 2, \ldots,
$$

where $u^{r+1}$ is a new approximation to $u$ in the $(r+1)$-th iteration. To initiate the
iteration, we need a start vector $u^0$.

The Jacobi and SOR methods are based on splitting $A$ into a strictly lower triangular
part $L$, the diagonal $D$, and a strictly upper triangular part $U$, such that $A = L + D + U$.
The Jacobi method corresponds to $M = D$ and $N = -L - U$. The Gauss-Seidel
method employs $M = L + D$ and $N = -U$, while the SOR method corresponds
to

$$
M = \frac{1}{\omega}D + L, \quad N = \frac{1-\omega}{\omega}D - U .
$$

The relaxed Jacobi method has similar expressions:

$$
M = \frac{1}{\omega}D, \quad N = \frac{1-\omega}{\omega}D - L - U .
$$


With the matrix forms of the Jacobi and SOR methods as written above, we could 
in an implementation alternatively fill the matrix A with entries and call general 
implementations of the Jacobi or SOR methods that work on a system Au = b. 
However, this is almost never done since forming the matrix A requires quite some 
code and storing A in the computer’s memory is unnecessary. It is much easier to 
just apply the Jacobi and SOR ideas to the finite difference stencils directly in an 
implementation, as we have shown in detail. 

Nevertheless, the matrix formulation of the Jacobi and SOR methods has been
important for analyzing their convergence behavior. One can show that the error
$u^r - u$ fulfills $u^r - u = G^r(u^0 - u)$, where $G = M^{-1}N$ and $G^r$ is the $r$-th power
of the matrix $G$. For the method to converge, $\lim_{r\rightarrow\infty}\|G^r\| = 0$ is a necessary and
sufficient condition. This is equivalent to the spectral radius of $G$ being less than
one. Since $G$ is directly related to the finite difference scheme for the underlying
PDE problem, one can in principle compute the spectral radius. For a given PDE
problem, however, this is not a practical strategy, since it is very difficult to develop
useful formulas. Analysis of model problems, usually related to the Poisson
equation, reveals some trends of interest: the convergence rate of the Jacobi method
goes like $h^2$, while that of SOR with an optimal $\omega$ goes like $h$, where $h$ is the spatial
spacing: $h = \Delta x = \Delta y$. That is, the efficiency of the Jacobi method quickly
deteriorates with increasing mesh resolution, and SOR is much to be preferred
(even if the optimal $\omega$ remains an open question). We refer to Chapter 4 of [16] for
more information on the convergence theory. One important result is that if $A$ is
symmetric and positive definite, then SOR will converge for any $0 < \omega < 2$.
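For a small matrix the splittings and the spectral radius of $G = M^{-1}N$ can be computed directly. The sketch below does this with dense NumPy arrays for the 1D Poisson model matrix with entries $-1, 2, -1$, which is an assumption chosen because the Jacobi spectral radius is then known to be $\cos(\pi/(n+1))$. The function names splitting and spectral_radius are hypothetical.

```python
import numpy as np

def splitting(A, method, omega=1.0):
    """Return M and N with A = M - N for the named iterative method."""
    D = np.diag(np.diag(A))
    L = np.tril(A, k=-1)      # strictly lower triangular part
    U = np.triu(A, k=1)       # strictly upper triangular part
    if method == 'Jacobi':
        return D, -L - U
    if method == 'relaxed Jacobi':
        return D/omega, (1 - omega)/omega*D - L - U
    if method == 'Gauss-Seidel':
        return L + D, -U
    if method == 'SOR':
        return D/omega + L, (1 - omega)/omega*D - U
    raise ValueError('unknown method: %s' % method)

def spectral_radius(A, method, omega=1.0):
    """Largest eigenvalue (in absolute value) of G = M^{-1} N."""
    M, N = splitting(A, method, omega)
    G = np.linalg.solve(M, N)
    return np.abs(np.linalg.eigvals(G)).max()

n = 20
A = 2*np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
rho_J = spectral_radius(A, 'Jacobi')
rho_GS = spectral_radius(A, 'Gauss-Seidel')
rho_SOR = spectral_radius(A, 'SOR', omega=1.5)
print('Jacobi %.4f, Gauss-Seidel %.4f, SOR(1.5) %.4f'
      % (rho_J, rho_GS, rho_SOR))
```

All three radii are below one, so all three methods converge for this matrix, and the smaller the radius, the fewer iterations are needed for a given error reduction.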

The optimal $\omega$ parameter can be theoretically established for a Poisson problem
as

$$
\omega_o = \frac{2}{1 + \sqrt{1 - \varrho^2}}, \quad \varrho = \frac{\cos(\pi/N_x) + (\Delta x/\Delta y)^2\cos(\pi/N_y)}{1 + (\Delta x/\Delta y)^2} . \tag{3.112}
$$

This formula can be used as a guide also in other problems.

The Jacobi and the SOR methods have the great advantage of being trivial
to implement, so they are obviously popular for this reason. However, the slow
convergence of these methods limits the popularity to fairly small linear systems
(i.e., coarse meshes). As soon as the matrix size grows, one is better off with more
sophisticated iterative methods like the preconditioned Conjugate gradient method,
which we now turn to.

Finally, we mention that there is a variant of the SOR method, called the Symmetric
Successive Over-relaxation method, known as SSOR, where one runs a standard
SOR sweep through the mesh points and then a new sweep while visiting the points
in reverse order.
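As a small illustration of the optimal relaxation parameter (3.112), the sketch below evaluates $\omega_o$ on a sequence of refined meshes. The function name optimal_omega is an assumption for this example; for $\Delta x = \Delta y$ and $N_x = N_y = N$ the formula reduces to $2/(1 + \sin(\pi/N))$.

```python
import numpy as np

def optimal_omega(Nx, Ny, dx, dy):
    """Optimal SOR relaxation parameter (3.112) for a Poisson problem."""
    q = (dx/dy)**2
    rho = (np.cos(np.pi/Nx) + q*np.cos(np.pi/Ny))/(1 + q)
    return 2.0/(1 + np.sqrt(1 - rho**2))

omegas = [optimal_omega(N, N, 1.0/N, 1.0/N) for N in (10, 20, 40, 80)]
for N, w in zip((10, 20, 40, 80), omegas):
    print('N=%3d  omega_o=%.4f' % (N, w))
```

The optimal parameter approaches 2 as the mesh is refined, which explains why over-relaxation matters most on fine meshes.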


3.6.16 The Conjugate Gradient Method 


There is no simple intuitive derivation of the Conjugate gradient method, so we refer
to the many excellent expositions in the literature for the idea of the method and
how the algorithm is derived. In particular, we recommend the books [1, 2, 5, 16].
A brief overview is provided in the Wikipedia article⁴. Here, we just state the pros
and cons of the method from a user's perspective and how we utilize it in code.

The original Conjugate gradient method is limited to linear systems Au = b, 
where A is a symmetric and positive definite matrix. There are, however, extensions 
of the method to non-symmetric matrices. 

A major advantage of all conjugate gradient methods is that the matrix $A$ is
only used in matrix-vector products, so we do not need to form and store $A$ if we can
provide code for computing a matrix-vector product $Au$. Another important feature
is that the algorithm is very easy to vectorize and parallelize. The primary downside
of the method is that it converges slowly unless one has an effective preconditioner
for the system. That is, instead of solving $Au = b$, we try to solve $M^{-1}Au =
M^{-1}b$ in the hope that the method works better for this preconditioned system. The
matrix $M$ is the preconditioner or preconditioning matrix. Now we need to perform
matrix-vector products $y = M^{-1}Au$, which is done in two steps: first the matrix-vector
product $v = Au$ is carried out and then the system $My = v$ must be solved.
Therefore, $M$ must be cheap to compute and systems $My = v$ must be cheap to
solve.

A perfect preconditioner is $M = A$, but in each iteration in the Conjugate gradient
method one then has to solve a system with $A$ as coefficient matrix! A key
idea is to let $M$ be some kind of cheap approximation to $A$. The simplest preconditioner
is to set $M = D$, where $D$ is the diagonal of $A$. This choice means running
one Jacobi iteration as preconditioner. Exercise 3.8 shows that the Jacobi and SOR
methods can also be viewed as preconditioners.
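The matrix-free idea and the diagonal preconditioner can be combined in a short experiment. The sketch below wraps the Backward Euler stencil (3.97) in a scipy.sparse.linalg.LinearOperator, so that no matrix is stored, and passes the action of $D^{-1}$ as the preconditioner argument M of cg. Homogeneous Dirichlet conditions, the mesh size and the right-hand side are assumptions made for the illustration, and the functions matvec and jacobi_prec are hypothetical helpers.

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, cg

Nx = Ny = 20
Fx = Fy = 2.0
shape = (Nx - 1, Ny - 1)            # interior unknowns only
N = shape[0]*shape[1]

def matvec(v):
    """Apply the Backward Euler operator of (3.97) to the vector v."""
    u = np.zeros((Nx + 1, Ny + 1))  # zero Dirichlet boundary values
    u[1:-1, 1:-1] = v.reshape(shape)
    c = u[1:-1, 1:-1]
    Au = (c - Fx*(u[:-2, 1:-1] - 2*c + u[2:, 1:-1])
            - Fy*(u[1:-1, :-2] - 2*c + u[1:-1, 2:]))
    return Au.ravel()

def jacobi_prec(v):
    """Solve D y = v, where D is the (constant) diagonal of A."""
    return v/(1.0 + 2*Fx + 2*Fy)

A = LinearOperator((N, N), matvec=matvec, dtype=float)
M = LinearOperator((N, N), matvec=jacobi_prec, dtype=float)
b = np.ones(N)
c, info = cg(A, b, M=M)             # M must act like the inverse of D
rel_res = np.linalg.norm(matvec(c) - b)/np.linalg.norm(b)
print('info=%d, relative residual=%.2e' % (info, rel_res))
```

Since the diagonal is constant in this problem, the Jacobi preconditioner only rescales the system, so a real gain requires a better approximation to $A$, such as an incomplete factorization.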

Constructing good preconditioners is a scientific field on its own. Here we shall
treat the topic just very briefly. For a user having access to the scipy.sparse.linalg
library, there are Conjugate gradient methods and preconditioners readily
available:


• For positive definite, symmetric systems: cg (the Conjugate gradient method)
• For symmetric systems: minres (Minimum residual method)
• For non-symmetric systems:
  - gmres (GMRES: Generalized minimum residual method)
  - bicg (BiConjugate gradient method)
  - bicgstab (Stabilized BiConjugate gradient method)
  - cgs (Conjugate gradient squared method)
  - qmr (Quasi-minimal residual iteration)
• Preconditioner: spilu (Sparse, incomplete LU factorization)

4 https://en.wikipedia.org/wiki/Conjugate_gradient_method


The ILU preconditioner is an attractive all-round type of preconditioner that is suitable
for most problems on serial computers. A more efficient preconditioner is
the multigrid method, and algebraic multigrid is also an all-round choice as preconditioner.
The Python package PyAMG⁵ offers efficient implementations of the
algebraic multigrid method, to be used both as a preconditioner and as a stand-alone
iterative method.

The matrix arising from implicit time discretization methods applied to the diffusion
equation is symmetric and positive definite. Thus, we can use the Conjugate
gradient method (cg), typically in combination with an ILU preconditioner. The
code is very similar to the one we created when solving the linear system by sparse
Gaussian elimination. The main difference is that we now allow for calling up the
Conjugate gradient function as an alternative solver.


def solver_sparse(
    I, a, f, Lx, Ly, Nx, Ny, dt, T, theta=0.5,
    U_0x=0, U_0y=0, U_Lx=0, U_Ly=0, user_action=None,
    method='direct', CG_prec='ILU', CG_tol=1E-5):
    """
    Full solver for the model problem using the theta-rule
    difference approximation in time. Sparse matrix with
    dedicated Gaussian elimination algorithm (method='direct')
    or ILU preconditioned Conjugate Gradients (method='CG' with
    tolerance CG_tol and preconditioner CG_prec ('ILU' or None)).
    """
    # Set up data structures as shown before

    # Precompute sparse matrix

    A = scipy.sparse.diags(
        diagonals=[main, lower, upper, lower2, upper2],
        offsets=[0, -lower_offset, lower_offset,
                 -lower2_offset, lower2_offset],
        shape=(N, N), format='csc')

    if method == 'CG':
        if CG_prec == 'ILU':
            # Find ILU preconditioner (constant in time)
            A_ilu = scipy.sparse.linalg.spilu(A)  # SuperLU defaults
            M = scipy.sparse.linalg.LinearOperator(
                shape=(N, N), matvec=A_ilu.solve)
        else:
            M = None
        CG_iter = []  # No of CG iterations at time level n

    # Time loop
    for n in It[0:-1]:
        # Compute b, vectorized version

        # Solve matrix system A*c = b
        if method == 'direct':
            c = scipy.sparse.linalg.spsolve(A, b)
        elif method == 'CG':
            x0 = u_n.T.reshape(N)  # Start vector is u_n
            CG_iter.append(0)

            def CG_callback(c_k):
                """Trick to count the no of iterations in CG."""
                CG_iter[-1] += 1

            # Note: recent SciPy versions name the tolerance argument rtol
            c, info = scipy.sparse.linalg.cg(
                A, b, x0=x0, tol=CG_tol, maxiter=N, M=M,
                callback=CG_callback)

        # Fill u with vector c
        # Update u_n before next step
        u_n, u = u, u_n

5 https://github.com/pyamg/pyamg


The number of iterations in the Conjugate gradient method is of interest, but is 
unfortunately not available from the cg function. Therefore, we perform a trick: 
in each iteration a user function CG_callback is called where we accumulate the 
number of iterations in a list CG_iter. 
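
The counting trick can be tried in isolation on a small model problem. The sketch below is not the solver_sparse code: it assumes a matrix of the form I + F·(2D Laplacian), with an arbitrarily chosen F, as a stand-in for the matrix from an implicit diffusion scheme, and the helper names model_matrix and cg_count are made up for this illustration. The tolerance argument of cg is left at its default value since its name differs between SciPy versions.

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla


def model_matrix(nx, ny, F=5.0):
    """SPD stand-in for an implicit diffusion matrix: I + F*(2D Laplacian)."""
    Tx = sp.diags([-1, 2, -1], [-1, 0, 1], shape=(nx, nx))
    Ty = sp.diags([-1, 2, -1], [-1, 0, 1], shape=(ny, ny))
    L = sp.kron(sp.identity(ny), Tx) + sp.kron(Ty, sp.identity(nx))
    return (sp.identity(nx*ny) + F*L).tocsc()


def cg_count(A, b, M=None):
    """Solve A*c = b by CG; return (c, info, number of iterations)."""
    num_iter = [0]                 # Mutable counter seen by the callback

    def callback(c_k):
        num_iter[0] += 1           # Called once per CG iteration

    c, info = spla.cg(A, b, M=M, callback=callback)
    return c, info, num_iter[0]


A = model_matrix(30, 30)
N = A.shape[0]
b = np.ones(N)

# ILU preconditioner wrapped as a linear operator (as in solver_sparse)
A_ilu = spla.spilu(A)
M = spla.LinearOperator(shape=(N, N), matvec=A_ilu.solve)

c_plain, info_plain, it_plain = cg_count(A, b)
c_ilu, info_ilu, it_ilu = cg_count(A, b, M=M)
print('CG iterations without/with ILU:', it_plain, it_ilu)
```

A return value info equal to 0 signals that cg reached its tolerance. The two iteration counts give a direct impression of what the preconditioner buys for this particular matrix.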


3.6.17 What Is the Recommended Method for Solving Linear 
Systems? 


There is no clear answer to this question. If you have enough memory and computing
time available, direct methods such as spsolve are to be preferred since they
are easy to use and find an almost exact solution. However, in larger 2D and in
3D problems, direct methods usually run too slowly or require too much memory,
so one is forced to use iterative methods. The fastest and most reliable methods are
in the Conjugate Gradient family, but these require suitable preconditioners. ILU is
an all-round preconditioner, but it is not suited for parallel computing. The Jacobi
and SOR iterative methods are easy to implement, and popular for that reason, but
run slowly. Jacobi iteration is not an option in real problems, but SOR may be.


3.7 Random Walk 


Models leading to diffusion equations, see Sect. 3.8, are usually based on reasoning 
with averaged physical quantities such as concentration, temperature, and velocity. 
The underlying physical processes involve complicated microscopic movement of 
atoms and molecules, but an average of a large number of molecules is performed 
in a small volume before the modeling starts, and the averaged quantity inside this 
volume is assigned as a point value at the centroid of the volume. This means that 
concentration, temperature, and velocity at a space-time point represent averages 
around the point in a small time interval and small spatial volume.

Random walk is a fundamentally different kind of modeling procedure compared to
the reasoning behind partial differential equations. The idea in random walk is to
have a large number of “particles” that undergo random movements. Averaging can
then be used afterwards to compute macroscopic quantities like concentration. The
“particles” and their random movement represent a very simplified microscopic
behavior of molecules, much simpler and computationally much more efficient than
direct molecular simulation⁶, yet the random walk model has been very powerful for
describing a wide range of phenomena, including heat conduction, quantum mechanics,
polymer chains, population genetics, neuroscience, games of chance, and pricing
of financial instruments.

It can be shown that random walk, when averaged, produces models that are 
mathematically equivalent to diffusion equations. This is the primary reason why 
we treat random walk in this chapter: two very different algorithms (finite difference 
stencils and random walk) solve the same type of problems. The simplicity of 
the random walk algorithm makes it particularly attractive for solving diffusion 
equations on massively parallel computers. The exposition here is as simple as
possible, and a thorough derivation of the models is provided by Hjorth-Jensen [7].


3.7.1 Random Walk in 1D 


Imagine that we have some particles that perform random moves, either to the right 
or to the left. We may flip a coin to decide the movement of each particle, say head 
implies movement to the right and tail means movement to the left. Each move is 
one unit length. Physicists use the term random walk for this type of movement. 
The movement is also known as drunkard’s walk⁷. You may try this yourself: flip
the coin and make one step to the left or right, and repeat the process. 

We introduce the symbol N for the number of steps in a random walk. Fig- 
ure 3.16 shows four different random walks with N = 200. 


3.7.2 Statistical Considerations 


Let $S_k$ be the stochastic variable representing a step to the left or to the right in step
number $k$. We have that $S_k = -1$ with probability $p$ and $S_k = 1$ with probability
$q = 1 - p$. The variable $S_k$ is known as a Bernoulli variable⁸. The expectation of
$S_k$ is

\[ E[S_k] = p\cdot(-1) + q\cdot 1 = 1 - 2p, \]

and the variance is

\[ \mathrm{Var}(S_k) = E[S_k^2] - E[S_k]^2 = 1 - (1 - 2p)^2 = 4p(1 - p). \]


6 https://en.wikipedia.org/wiki/Molecular_dynamics 
7 https://en.wikipedia.org/wiki/The_Drunkard%27s_Walk 
8 https://en.wikipedia.org/wiki/Bernoulli_distribution

Fig. 3.16 Ensemble of 4 random walks, each with 200 steps


The position after $k$ steps is another stochastic variable

\[ X_k = \sum_{i=0}^{k-1} S_i. \]

The expected position is

\[ E[X_k] = E\left(\sum_{i=0}^{k-1} S_i\right) = \sum_{i=0}^{k-1} E[S_i] = k(1 - 2p). \]

All the $S_k$ variables are independent. The variance therefore becomes

\[ \mathrm{Var}(X_k) = \mathrm{Var}\left(\sum_{i=0}^{k-1} S_i\right) = \sum_{i=0}^{k-1} \mathrm{Var}(S_i) = 4kp(1 - p). \]

We see that $\mathrm{Var}(X_k)$ is proportional to the number of steps $k$. For the very
important case $p = q = \frac{1}{2}$, $E[X_k] = 0$ and $\mathrm{Var}(X_k) = k$.

How can we estimate $E[X_k] = 0$ and $\mathrm{Var}(X_k) = k$? We must have many
random walks of the type in Fig. 3.16, say $W$ of them. For a given $k$, say $k = 100$, we find all
the values of $X_k$, name them $x_{0,k}$, $x_{1,k}$, $x_{2,k}$, and so on. The empirical estimate of
$E[X_k]$ is the average,

\[ E[X_k] \approx \frac{1}{W}\sum_{j=0}^{W-1} x_{j,k}, \]

while an empirical estimate of $\mathrm{Var}(X_k)$ is

\[ \mathrm{Var}(X_k) \approx \frac{1}{W}\sum_{j=0}^{W-1} (x_{j,k})^2 - \left(\frac{1}{W}\sum_{j=0}^{W-1} x_{j,k}\right)^2. \]

That is, we take the statistics for a given $k$ across the ensemble of random walks
(“vertically” in Fig. 3.16). The key quantities to record are $\sum_j x_{j,k}$ and $\sum_j x_{j,k}^2$.
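
The two empirical estimates can be tried out in a few lines of numpy. The following is a minimal sketch, assuming that a step is −1 with probability p and +1 otherwise; the function name ensemble_statistics, the seed, and the ensemble size are chosen for this illustration only.

```python
import numpy as np


def ensemble_statistics(W, k, p, seed=1):
    """Estimate E[X_k] and Var(X_k) from W independent walks of k steps."""
    rng = np.random.default_rng(seed)
    r = rng.uniform(0, 1, size=(W, k))
    steps = np.where(r <= p, -1, 1)    # -1 with probability p, else +1
    x = steps.sum(axis=1)              # Position x_{j,k} of walk j
    sum_x = x.sum()                    # sum_j x_{j,k}
    sum_x2 = (x**2).sum()              # sum_j x_{j,k}^2
    mean = sum_x/W
    var = sum_x2/W - mean**2
    return mean, var


W, k, p = 20000, 100, 0.5
mean, var = ensemble_statistics(W, k, p)
print('mean: %.3f (theory %.3f)' % (mean, k*(1 - 2*p)))
print('var:  %.3f (theory %.3f)' % (var, 4*k*p*(1 - p)))
```

Only the two sums are needed to form the estimates, which is why they are the quantities worth recording during a simulation.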


3.7.3 Playing Around with Some Code 


Scalar code Python has a random module for drawing random numbers, and this
module has a function uniform(a, b) for drawing a uniformly distributed random
number in the interval [a, b). If an event happens with probability p, we can
simulate this on the computer by drawing a random number r in [0, 1), because then
r ≤ p with probability p and r > p with probability 1 − p:


import random

r = random.uniform(0, 1)
if r <= p:
    pass  # Event happens
else:
    pass  # Event does not happen


A random walk with N steps, starting at $x_0$, where we move to the left with
probability p and to the right with probability 1 − p can now be implemented by


import random, numpy as np


def random_walk1D(x0, N, p):
    """1D random walk with 1 particle and N steps."""
    # Store position in step k in position[k]; N steps give N+1 positions
    position = np.zeros(N+1)
    position[0] = x0
    current_pos = x0
    for k in range(N):
        r = random.uniform(0, 1)
        if r <= p:
            current_pos -= 1   # Move to the left
        else:
            current_pos += 1   # Move to the right
        position[k+1] = current_pos
    return position


Vectorized code Since N is supposed to be large and we want to repeat the process
for many particles, we should speed up the code as much as possible. Vectorization
is the obvious technique here: we draw all the random numbers at once with the aid
of numpy, and then we formulate vector operations to get rid of the loop over the
steps (k). The numpy.random module has vectorized versions of the functions in
Python’s built-in random module. For example, numpy.random.uniform(a, b,
N) returns N random numbers uniformly distributed between a (included) and b (not
included).

We can then make an array of all the steps in a random walk: if the random 
number is less than or equal to p, the step is —1, otherwise the step is 1: 


r = np.random.uniform(0, 1, size=N) 
steps = np.where(r <= p, -1, 1) 


The value of position[k] is the sum of all steps up to step k. Such sums are
often needed in vectorized algorithms and are therefore available through the numpy.cumsum
function:


>>> import numpy as np 
>>> np.cumsum(np.array([1,3,4,6])) 
array([ 1, 4, 8, 14]) 


The resulting array in this demo has elements 1, 1 + 3 = 4, 1 + 3 + 4 = 8, and
1 + 3 + 4 + 6 = 14.

We can now vectorize the random_walk1D function:


def random_walk1D_vec(x0, N, p):
    """Vectorized version of random_walk1D."""
    # Store position in step k in position[k]
    position = np.zeros(N+1)
    position[0] = x0
    r = np.random.uniform(0, 1, size=N)
    steps = np.where(r <= p, -1, 1)
    position[1:] = x0 + np.cumsum(steps)
    return position


This code runs about 10 times faster than the scalar version. With a parallel numpy 
library, the code can also automatically take advantage of hardware for parallel 
computing because each of the four array operations can be trivially parallelized. 
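
The speed-up factor depends on the machine, the Python version, and N, so it is worth measuring. Below is a small timing sketch. It uses compact stand-ins for the scalar and vectorized walks (named walk_scalar and walk_vec here, both drawing from np.random), and the helper best_time is a made-up name for a minimal wall-clock timer.

```python
import time
import numpy as np


def walk_scalar(x0, N, p):
    """Scalar walk; draws one random number at a time."""
    position = np.zeros(N+1)
    position[0] = x0
    current_pos = x0
    for k in range(N):
        r = np.random.uniform(0, 1)
        current_pos += -1 if r <= p else 1
        position[k+1] = current_pos
    return position


def walk_vec(x0, N, p):
    """Vectorized walk; draws all random numbers at once."""
    position = np.zeros(N+1)
    position[0] = x0
    r = np.random.uniform(0, 1, size=N)
    position[1:] = x0 + np.cumsum(np.where(r <= p, -1, 1))
    return position


def best_time(func, *args, repetitions=3):
    """Return the smallest wall-clock time of a few calls func(*args)."""
    best = float('inf')
    for _ in range(repetitions):
        t0 = time.perf_counter()
        func(*args)
        best = min(best, time.perf_counter() - t0)
    return best


N = 20000
t_scalar = best_time(walk_scalar, 0, N, 0.5)
t_vec = best_time(walk_vec, 0, N, 0.5)
print('scalar: %.4f s, vectorized: %.4f s, ratio: %.1f'
      % (t_scalar, t_vec, t_scalar/t_vec))
```

Taking the best of a few repetitions reduces the influence of other processes on the measurement.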


Fixing the random sequence During software development with random numbers 
it is advantageous to always generate the same sequence of random numbers, as this 
may help debugging processes. To fix the sequence, we set a seed of the random 
number generator to some chosen integer, e.g., 


np.random.seed(10) 
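
The effect of the seed is easy to check directly. In this small sketch the array names and the array length are arbitrary:

```python
import numpy as np

np.random.seed(10)
first = np.random.uniform(0, 1, size=5)

np.random.seed(10)              # Reset the generator to the same state
second = np.random.uniform(0, 1, size=5)

np.random.seed(11)              # Another seed starts another sequence
third = np.random.uniform(0, 1, size=5)

print((first == second).all())  # Same seed: identical numbers
print((first == third).all())   # Different seed: the numbers differ
```
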


Calls to random_walk1D_vec give positions of the particle as depicted in Fig. 3.17.
The particle starts at the origin and moves with $p = \frac{1}{2}$. Since the seed is the same,
the plot to the left is just a magnification of the first 1000 steps in the plot to the
right.

Fig. 3.17 1000 (left) and 50,000 (right) steps of a random walk


Verification When we have a scalar and a vectorized code, it is always a good idea
to develop a unit test for checking that they produce the same result. A problem
in the present context is that the two versions apply two different random number
generators. For a test to be meaningful, we need to fix the seed and use the same
generator. This means that the scalar version must either use np.random or have
this as an option. An option is the most flexible choice:


import random

def random_walk1D(x0, N, p, random=random):
    ...
    r = random.uniform(0, 1)
    ...


Using random=np.random, the r variable gets computed by np.random.uniform,
and the sequence of random numbers will be the same as in the vectorized version
that employs the same generator (given that the seed is also the same). A proper test
function may be to check that the positions in the walk are the same in the scalar
and vectorized implementations:


def test_random_walk1D():
    # For fixed seed, check that scalar and vectorized versions
    # produce the same result
    x0 = 2;  N = 4;  p = 0.6
    np.random.seed(10)
    scalar_computed = random_walk1D(x0, N, p, random=np.random)
    np.random.seed(10)
    vectorized_computed = random_walk1D_vec(x0, N, p)
    assert (scalar_computed == vectorized_computed).all()


Note that we employ == for arrays with real numbers, which is normally an inadequate
test due to rounding errors, but in the present case, all arithmetic consists
of adding or subtracting one, so these operations are expected to have no rounding
errors. Comparing two numpy arrays with == results in a boolean array, so we need
to call the all() method to ensure that all elements are True, i.e., that all elements
in the two arrays match each other pairwise.


3.7.4 Equivalence with Diffusion


The original random walk algorithm can be said to work with dimensionless
coordinates $\bar x_i = -N + i$, $i = 0, 1, \ldots, 2N$ ($\bar x_i \in [-N, N]$), and $\bar t_n = n$,
$n = 0, 1, \ldots, N$. A mesh with spacings $\Delta x$ and $\Delta t$ with dimensions can be
introduced by

\[ x_i = X_0 + \bar x_i \Delta x, \quad t_n = \bar t_n \Delta t. \]

If we implement the algorithm with dimensionless coordinates, we can just use this
rescaling to obtain the movement in a coordinate system without unit spacings.

Let $P_i^{n+1}$ be the probability of finding the particle at mesh point $\bar x_i$ at time $\bar t_{n+1}$.
We can reach mesh point $(i, n+1)$ in two ways: either coming in from the left
from $(i-1, n)$ or from the right $(i+1, n)$. Each has probability $\frac{1}{2}$ (if we assume
$p = q = \frac{1}{2}$). The fundamental equation for $P_i^{n+1}$ is

\[ P_i^{n+1} = \frac{1}{2} P_{i-1}^n + \frac{1}{2} P_{i+1}^n. \tag{3.113} \]

(This equation is easiest to understand if one looks at the random walk as a Markov
process and applies the transition probabilities, but this is beyond the scope of the
present text.)

Subtracting $P_i^n$ from (3.113) results in

\[ P_i^{n+1} - P_i^n = \frac{1}{2}\left(P_{i-1}^n - 2P_i^n + P_{i+1}^n\right). \]

Readers who have seen the Forward Euler discretization of a 1D diffusion equation
recognize this scheme as very close to such a discretization. We have

\[ \frac{\partial}{\partial t} P(x_i, t_n) = \frac{P_i^{n+1} - P_i^n}{\Delta t} + O(\Delta t), \]

or in dimensionless coordinates

\[ \frac{\partial}{\partial \bar t} P(\bar x_i, \bar t_n) \approx P_i^{n+1} - P_i^n. \]

Similarly, we have

\[ \frac{\partial^2}{\partial x^2} P(x_i, t_n) = \frac{P_{i-1}^n - 2P_i^n + P_{i+1}^n}{\Delta x^2} + O(\Delta x^2), \]

\[ \frac{\partial^2}{\partial \bar x^2} P(\bar x_i, \bar t_n) \approx P_{i-1}^n - 2P_i^n + P_{i+1}^n. \]

Equation (3.113) is therefore equivalent with the dimensionless diffusion equation

\[ \frac{\partial P}{\partial \bar t} = \frac{1}{2}\frac{\partial^2 P}{\partial \bar x^2}, \tag{3.114} \]

or the diffusion equation

\[ \frac{\partial P}{\partial t} = D\frac{\partial^2 P}{\partial x^2}, \tag{3.115} \]

with diffusion coefficient

\[ D = \frac{\Delta x^2}{2\Delta t}. \]

This derivation shows the tight link between random walk and diffusion. If we
keep track of where the particle is, and repeat the process many times, or run the
algorithms for lots of particles, the histogram of the positions will approximate the
solution of the diffusion equation for the local probability $P_i^n$.

Suppose all the random walks start at the origin. Then the initial condition for
the probability distribution is the Dirac delta function $\delta(x)$. The solution of (3.114)
can be shown to be

\[ \bar P(\bar x, \bar t) = \frac{1}{\sqrt{4\pi\alpha \bar t}}\, e^{-\frac{\bar x^2}{4\alpha \bar t}}, \tag{3.116} \]

where $\alpha = \frac{1}{2}$.
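
The link between (3.113) and (3.116) can be checked numerically without any random numbers by iterating (3.113) directly on the probabilities. The code below is a minimal sketch under the assumption that all probability starts at the origin; the function names and the choice of 200 steps are made up for the illustration.

```python
import numpy as np


def evolve_probabilities(num_steps):
    """Apply P_i^{n+1} = (P_{i-1}^n + P_{i+1}^n)/2, starting from unit
    probability at x = 0. Return mesh points x and probabilities P."""
    x = np.arange(-num_steps, num_steps + 1)   # All reachable points
    P = np.zeros(x.size)
    P[num_steps] = 1.0                         # Discrete delta at x = 0
    for n in range(num_steps):
        P_new = np.zeros_like(P)
        P_new[1:-1] = 0.5*P[:-2] + 0.5*P[2:]
        # End points have only one neighbor inside the mesh
        P_new[0] = 0.5*P[1]
        P_new[-1] = 0.5*P[-2]
        P = P_new
    return x, P


def gaussian(x, t, alpha=0.5):
    """The solution (3.116) of the dimensionless diffusion equation."""
    return np.exp(-x**2/(4*alpha*t))/np.sqrt(4*np.pi*alpha*t)


n = 200
x, P = evolve_probabilities(n)
total = P.sum()                    # Total probability
variance = np.sum(x**2*P)          # Variance of the position (mean is 0)
# After an even number of steps only even x are occupied, so each
# occupied point carries the probability of an interval of length 2
even = (x % 2 == 0)
deviation = np.abs(P[even] - 2*gaussian(x[even], n)).max()
print('sum of P:', total)
print('variance:', variance)
print('max deviation from (3.116):', deviation)
```

The factor 2 in the comparison comes from the parity of the walk: a probability at a mesh point must be divided by the length of the interval it represents before it can be compared with a probability density.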


3.7.5 Implementation of Multiple Walks 


Our next task is to implement an ensemble of walks (for statistics, see Sect. 3.7.2)
and also provide data from the walks such that we can compute the probabilities of
the positions as introduced in the previous section. An appropriate representation of
the probabilities $P_i^n$ is histograms (with $i$ along the x axis) for a few selected values
of $n$.

To estimate the expectation and variance of the random walks, Sect. 3.7.2 points
to recording $\sum_j x_{j,k}$ and $\sum_j x_{j,k}^2$, where $x_{j,k}$ is the position at time/step level $k$ in
random walk number $j$. The histogram of positions needs the individual values $x_{j,k}$
for all $j$ values and some selected $k$ values.

We introduce position[k] to hold $\sum_j x_{j,k}$, position2[k] to hold $\sum_j (x_{j,k})^2$,
and pos_hist[j,k] to hold $x_{j,k}$. A selection of k values can be specified by saying
how many, num_times, and letting them be equally spaced through time:


pos_hist_times = [(N//num_times)*i for i in range(num_times)] 


This is one of the few situations where we want integer division (//) or real division 
rounded to an integer. 
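
A quick check of the formula with two made-up parameter sets shows the effect of the integer division:

```python
# num_times divides N: equally spaced steps starting at 0
N = 1000
num_times = 4
times_a = [(N//num_times)*i for i in range(num_times)]
print(times_a)      # [0, 250, 500, 750]

# num_times does not divide N: the spacing N//num_times is rounded down
N = 10
num_times = 4
times_b = [(N//num_times)*i for i in range(num_times)]
print(times_b)      # [0, 2, 4, 6]
```

Note that the last step, k = N, is never among the selected values.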


Scalar version Our scalar implementation of running num_walks random walks 
may go like this: 


def random_walks1D(x0, N, p, num_walks=1, num_times=1,
                   random=random):
    """Simulate num_walks random walks from x0 with N steps."""
    position = np.zeros(N+1)    # Accumulated positions
    position[0] = x0*num_walks
    position2 = np.zeros(N+1)   # Accumulated positions**2
    position2[0] = x0**2*num_walks
    # Histogram at num_times selected time points
    pos_hist = np.zeros((num_walks, num_times))
    pos_hist_times = [(N//num_times)*i for i in range(num_times)]
    #print('save hist:', pos_hist_times)

    for n in range(num_walks):
        num_times_counter = 0
        current_pos = x0
        for k in range(N):
            if k in pos_hist_times:
                #print('save, k:', k, num_times_counter, n)
                pos_hist[n,num_times_counter] = current_pos
                num_times_counter += 1
            # current_pos corresponds to step k+1
            r = random.uniform(0, 1)
            if r <= p:
                current_pos -= 1
            else:
                current_pos += 1
            position[k+1] += current_pos
            position2[k+1] += current_pos**2
    return position, position2, pos_hist, np.array(pos_hist_times)


Vectorized version We have already vectorized a single random walk. The
additional challenge here is to vectorize the computation of the data for the
histogram, pos_hist, but given the selected steps in pos_hist_times, we can
find the corresponding positions by indexing the array of the current walk with
the list pos_hist_times: walk[pos_hist_times], which are to be inserted in
pos_hist[n,:].


def random_walks1D_vec1(x0, N, p, num_walks=1, num_times=1):
    """Vectorized version of random_walks1D."""
    position = np.zeros(N+1)    # Accumulated positions
    position2 = np.zeros(N+1)   # Accumulated positions**2
    walk = np.zeros(N+1)        # Positions of current walk
    walk[0] = x0

    # Histogram at num_times selected time points
    pos_hist = np.zeros((num_walks, num_times))
    pos_hist_times = [(N//num_times)*i for i in range(num_times)]

    for n in range(num_walks):
        r = np.random.uniform(0, 1, size=N)
        steps = np.where(r <= p, -1, 1)
        walk[1:] = x0 + np.cumsum(steps)   # Positions of this walk
        position += walk
        position2 += walk**2
        pos_hist[n,:] = walk[pos_hist_times]
    return position, position2, pos_hist, np.array(pos_hist_times)


Improved vectorized version Looking at the vectorized version above, we still 
have one potentially long Python loop over n. Normally, num_walks will be much 
larger than N. The vectorization of the loop over N certainly speeds up the program, 
but if we think of vectorization as also a way to parallelize the code, all the in- 
dependent walks (the n loop) can be executed in parallel. Therefore, we should 
include this loop as well in the vectorized expressions, at the expense of using more 
memory. 

We introduce the array walks to hold the N + 1 positions of all the walks: each row
represents the positions in one walk.

walks = np.zeros((num_walks, N+1))   # Positions of each walk
walks[:,0] = x0


Since all the steps are independent, we can just make one long vector of enough
random numbers (N*num_walks), translate these numbers to ±1, and then reshape
the array such that the steps of each walk are stored in the rows.


r = np.random.uniform(0, 1, size=N*num_walks) 
steps = np.where(r <= p, -1, 1).reshape(num_walks, N) 


The next step is to sum up the steps in each walk. We need the np.cumsum
function for this, with the argument axis=1 for indicating a sum across the columns:


>>> a = np.arange(6).reshape(2,3)
>>> a
array([[0, 1, 2],
       [3, 4, 5]])
>>> np.cumsum(a, axis=1)
array([[ 0,  1,  3],
       [ 3,  7, 12]])


Now walks can be computed by 


walks[:,1:] = x0 + np.cumsum(steps, axis=1) 


The position vector is the sum of all the walks. That is, we want to sum all the 
rows, obtained by 


position = np.sum(walks, axis=0) 


A corresponding expression computes the squares of the positions. Finally, we need 
to compute pos_hist, but that is a matter of grabbing some of the walks (according 
to pos_hist_times): 


pos_hist[:,:] = walks[:,pos_hist_times] 
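
The two array operations, summing over the rows and picking out selected columns with a list of indices, can be illustrated on a tiny made-up walks array:

```python
import numpy as np

# Three walks with four stored positions each (made-up numbers)
walks = np.array([[0,  1,  2,  1],
                  [0, -1,  0,  1],
                  [0,  1,  0, -1]])

# Sum over the walks (the rows) for each step
position = np.sum(walks, axis=0)

# Indexing the columns with a list extracts the positions of all
# walks at the selected steps
pos_hist_times = [0, 2]
pos_hist = walks[:, pos_hist_times]

print(position)
print(pos_hist)
```

Here position becomes an array of length 4 (one sum per step), while pos_hist gets one row per walk and one column per selected step.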


The complete vectorized algorithm without any loop can now be summarized: 


def random_walks1D_vec2(x0, N, p, num_walks=1, num_times=1):
    """Vectorized version of random_walks1D; no loops."""
    position = np.zeros(N+1)    # Accumulated positions
    position2 = np.zeros(N+1)   # Accumulated positions**2
    walks = np.zeros((num_walks, N+1))   # Positions of each walk
    walks[:,0] = x0
    # Histogram at num_times selected time points
    pos_hist = np.zeros((num_walks, num_times))
    pos_hist_times = [(N//num_times)*i for i in range(num_times)]

    r = np.random.uniform(0, 1, size=N*num_walks)
    steps = np.where(r <= p, -1, 1).reshape(num_walks, N)
    walks[:,1:] = x0 + np.cumsum(steps, axis=1)
    position = np.sum(walks, axis=0)
    position2 = np.sum(walks**2, axis=0)
    pos_hist[:,:] = walks[:,pos_hist_times]
    return position, position2, pos_hist, np.array(pos_hist_times)


What is the gain of the vectorized implementations? One important gain is that
each vectorized operation can be automatically parallelized if one applies a parallel
numpy library like Numba⁹. On a single CPU, however, the speed-up of the
vectorized operations is also significant. With N = 1000 and 50,000 repeated walks,
the two vectorized versions run about 25 and 18 times faster than the scalar version,
with random_walks1D_vec1 being fastest.


Remark on vectorized code and parallelization Our first attempt at vectorization
removed the loop over the N steps in a single walk. However, the number of
walks is usually much larger than N, because of the need for accurate statistics.
Therefore, we should rather remove the loop over all walks. It turns out, from our
efficiency experiments, that the function random_walks1D_vec2 (with no loops) is
slower than random_walks1D_vec1. This is a bit surprising and may be explained
by less efficiency in the statements involving very large arrays, containing all steps
for all walks at once.

From a parallelization and improved vectorization point of view, it would be 
more natural to switch the sequence of the loops in the serial code such that the 
shortest loop is the outer loop: 


def random_walks1D2(x0, N, p, num_walks=1, num_times=1, ...):
    ...
    current_pos = x0 + np.zeros(num_walks)
    num_times_counter = -1

    for k in range(N):
        if k in pos_hist_times:
            num_times_counter += 1
            store_hist = True
        else:
            store_hist = False

        for n in range(num_walks):
            # current_pos corresponds to step k+1
            r = random.uniform(0, 1)
            if r <= p:
                current_pos[n] -= 1
            else:
                current_pos[n] += 1
            position[k+1] += current_pos[n]
            position2[k+1] += current_pos[n]**2
            if store_hist:
                pos_hist[n,num_times_counter] = current_pos[n]
    return position, position2, pos_hist, np.array(pos_hist_times)


9 http://numba.pydata.org


The vectorized version of this code, where we just vectorize the loop over n,
becomes


def random_walks1D2_vec1(x0, N, p, num_walks=1, num_times=1):
    """Vectorized version of random_walks1D2."""
    position = np.zeros(N+1)    # Accumulated positions
    position2 = np.zeros(N+1)   # Accumulated positions**2
    position[0] = x0*num_walks
    position2[0] = x0**2*num_walks
    # Histogram at num_times selected time points
    pos_hist = np.zeros((num_walks, num_times))
    pos_hist_times = [(N//num_times)*i for i in range(num_times)]

    # All walks start at x0
    current_pos = x0 + np.zeros(num_walks)
    num_times_counter = -1

    for k in range(N):
        if k in pos_hist_times:
            num_times_counter += 1
            store_hist = True   # Store histogram data for this k
        else:
            store_hist = False

        # Move all walks one step
        r = np.random.uniform(0, 1, size=num_walks)
        steps = np.where(r <= p, -1, 1)
        current_pos += steps
        position[k+1] = np.sum(current_pos)
        position2[k+1] = np.sum(current_pos**2)
        if store_hist:
            # Stored after the move, i.e., the positions at step k+1
            pos_hist[:,num_times_counter] = current_pos
    return position, position2, pos_hist, np.array(pos_hist_times)


This function runs significantly faster than the random_walks1D_vec1 function
above, typically 1.7 times faster. The code is also more appropriate in a parallel
computing context since each vectorized statement can work with data of size
num_walks over the compute units, repeated N times (compared with data of size
N, repeated num_walks times, in random_walks1D_vec1).

The scalar code with switched loops, random_walks1D2, runs a bit slower than
the original code in random_walks1D, so with the longest loop as the inner loop,
the vectorized function random_walks1D2_vec1 is almost 60 times faster than the
scalar counterpart, while the code random_walks1D_vec2 without loops is only
around 18 times faster. Taking into account the very large arrays required by the
latter function, we end up with random_walks1D2_vec1 as the preferred
implementation.


Test function During program development, it is highly recommended to carry out 
computations by hand for, e.g., N=4 and num_walks=3. Normally, this is done by 
executing the program with these parameters and checking with pen and paper that 
the computations make sense. The next step is to use this test for correctness in a 
formal test function. 

First, we need to check that the simulation of multiple random walks reproduces
the results of the single-walk functions random_walk1D and random_walk1D_vec
for the first walk, if the seed is the same. Second, we run three random walks (N=4)
with the scalar and the two vectorized versions and check that the returned arrays
are identical.

For this type of test to be successful, we must be sure that exactly the same set
of random numbers is used in the three versions, a fact that requires the same
random number generator and the same seed, of course, but also the same sequence of
computations. This is not obviously the case with the three random_walks1D*
functions we have presented. The critical issue in random_walks1D_vec1 is that the first
N random numbers are used for the first walk, the next N random numbers are
used for the second walk and so on, to be compatible with how the random numbers
are used in the function random_walks1D. For the function random_walks1D_vec2
the situation is a bit more complicated since we generate all the random numbers
at once. However, the critical step now is the reshaping of the array returned from
np.where: we must reshape as (num_walks, N) to ensure that the first N random
numbers are used for the first walk, the next N numbers are used for the second
walk, and so on.

We arrive at the test function below. 


def test_random_walks1D():
    # For fixed seed, check that scalar and vectorized versions
    # produce the same result
    x0 = 0;  N = 4;  p = 0.5

    # First, check that random_walks1D for 1 walk reproduces
    # the walk in random_walk1D
    num_walks = 1
    np.random.seed(10)
    computed = random_walks1D(
        x0, N, p, num_walks, random=np.random)
    np.random.seed(10)
    expected = random_walk1D(
        x0, N, p, random=np.random)
    assert (computed[0] == expected).all()

    # Same for vectorized versions
    np.random.seed(10)
    computed = random_walks1D_vec1(x0, N, p, num_walks)
    np.random.seed(10)
    expected = random_walk1D_vec(x0, N, p)
    assert (computed[0] == expected).all()
    np.random.seed(10)
    computed = random_walks1D_vec2(x0, N, p, num_walks)
    np.random.seed(10)
    expected = random_walk1D_vec(x0, N, p)
    assert (computed[0] == expected).all()

    # Second, check multiple walks: scalar == vectorized
    num_walks = 3
    num_times = N
    np.random.seed(10)
    serial_computed = random_walks1D(
        x0, N, p, num_walks, num_times, random=np.random)
    np.random.seed(10)
    vectorized1_computed = random_walks1D_vec1(
        x0, N, p, num_walks, num_times)
    np.random.seed(10)
    vectorized2_computed = random_walks1D_vec2(
        x0, N, p, num_walks, num_times)
    # positions: [0, 1, 0, 1, 2]
    # Can test without tolerance since everything is +/- 1
    return_values = ['pos', 'pos2', 'pos_hist', 'pos_hist_times']
    for s, v, r in zip(serial_computed,
                       vectorized1_computed,
                       return_values):
        msg = '%s: %s (serial) vs %s (vectorized)' % (r, s, v)
        assert (s == v).all(), msg
    for s, v, r in zip(serial_computed,
                       vectorized2_computed,
                       return_values):
        msg = '%s: %s (serial) vs %s (vectorized)' % (r, s, v)
        assert (s == v).all(), msg


Such test functions are indispensable for further development of the code as we 
can at any time test whether the basic computations remain correct or not. This 
is particularly important in stochastic simulations since without test functions and 
fixed seeds, we always experience variations from run to run, and it can be very 
difficult to spot bugs through averaged statistical quantities. 


3.7.6 Demonstration of Multiple Walks 


Assuming now that the code works, we can just scale up the number of steps in each 
walk and the number of walks. The latter influences the accuracy of the statistical 
estimates. Figure 3.18 shows the impact of the number of walks on the expectation, 
which should approach zero. Figure 3.19 displays the corresponding estimate of 
the variance of the position, which should grow linearly with the number of steps. 
It does, seemingly very accurately, but notice that the scale on the y axis is so much 
larger than for the expectation, so irregularities due to the stochastic nature of the 
process become so much less visible in the variance plots. The probability of find- 
ing a particle at a certain position at time (or step) 800 is shown in Fig. 3.20. The 
dashed red line is the theoretical distribution (3.116) arising from solving the dif- 
fusion equation (3.114) instead. As always, we realize that one needs significantly 
more statistical samples to estimate a histogram accurately than the expectation or 
variance. 
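
A comparison of this kind can be sketched in a few lines. The code below is not the program behind Fig. 3.20. It takes a shortcut that is an assumption of this sketch: for p = 1/2 the number of steps to the right is binomially distributed, so the positions at step k can be drawn directly instead of simulating every step. The function names, the seed, and the bin layout are also chosen for the illustration only.

```python
import numpy as np


def theoretical_density(x, t, alpha=0.5):
    """The distribution (3.116)."""
    return np.exp(-x**2/(4*alpha*t))/np.sqrt(4*np.pi*alpha*t)


def positions_at_step(k, num_walks, seed=1):
    """Positions of num_walks symmetric walks after k steps.

    Shortcut: with R steps to the right out of k, the position is
    R - (k - R) = 2R - k, and R is binomially distributed.
    """
    rng = np.random.default_rng(seed)
    R = rng.binomial(k, 0.5, size=num_walks)
    return 2*R - k


k = 800
x = positions_at_step(k, num_walks=100000)

# After an even number of steps all positions are even, so we use
# bins of width 2 centered at the even integers
edges = np.arange(-201, 203, 2)
density, _ = np.histogram(x, bins=edges, density=True)
centers = 0.5*(edges[:-1] + edges[1:])
error = np.abs(density - theoretical_density(centers, k)).max()
print('max deviation between histogram and (3.116): %.2e' % error)
```

Rerunning with fewer walks, say 100, makes the deviation grow markedly, in line with the observation that histograms need many samples.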


3.7.7 Ascii Visualization of 1D Random Walk 


Fig. 3.18 Estimated expected value for 1000 steps, using 100 walks (upper left), 10,000 (upper
right), 100,000 (lower left), and 1,000,000 (lower right)


If we want to study (very) long time series of random walks, it can be convenient
to plot the position in a terminal window with the time axis pointing downwards.
The module avplotter in SciTools has a class Plotter for plotting functions in
the terminal window with the aid of ascii symbols only. Below is the code required
to visualize a simple random walk that starts at the origin and ends when
the point $x = -1$ is reached. We use a spacing $\Delta x = 0.05$ (so $x = -1$ corresponds
to $i = -20$).


def run_random_walk():
    from scitools.avplotter import Plotter
    import time, numpy as np
    p = Plotter(-1, 1, width=75)   # Horizontal axis: 75 chars wide
    dx = 0.05
    np.random.seed(10)

    x = 0
    while True:
        random_step = 1 if np.random.random() > 0.5 else -1
        x = x + dx*random_step
        if x < -1:
            break   # Destination reached!
        print(p.plot(0, x))

        # Allow Ctrl+c to abort the simulation
        try:
            time.sleep(0.1)   # Wait for interrupt
        except KeyboardInterrupt:
            print('Interrupted by Ctrl+c')
            break


Fig. 3.19 Estimated variance over 1000 steps, using 100 walks (upper left), 10,000 (upper right),
100,000 (lower left), and 1,000,000 (lower right)


Observe that we implement an infinite loop, but allow a smooth interrupt of the
program by Ctrl+c through Python’s KeyboardInterrupt exception. This is a
useful recipe that can be used on many occasions!
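
The recipe can be isolated in a small helper function. The sketch below is one possible packaging, not code from SciTools: the name run_until_interrupted and its arguments are hypothetical, and an optional max_steps limit is added so that the loop can also be exercised without pressing any key.

```python
import time


def run_until_interrupted(step, max_steps=None, delay=0.1):
    """Call step() repeatedly until it returns False, max_steps calls
    have completed, or the user presses Ctrl+c during the pause.
    Return the number of completed calls."""
    count = 0
    while max_steps is None or count < max_steps:
        if step() is False:
            break
        count += 1
        try:
            time.sleep(delay)      # Ctrl+c is caught while we wait
        except KeyboardInterrupt:
            print('Interrupted by Ctrl+c')
            break
    return count


# Usage: three quick steps without waiting
calls = []
n = run_until_interrupted(lambda: calls.append(1), max_steps=3, delay=0)
print(n)
```

As in run_random_walk, only an interrupt that arrives during the pause is caught. An interrupt inside step() propagates as usual.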

The output looks typically like 


                    *                  |


Fig. 3.20 Estimated probability distribution at step 800, using 100 walks (upper left), 10,000
(upper right), 100,000 (lower left), and 1,000,000 (lower right)


Positions beyond the limits of the x axis appear with a value. A long file¹⁰ contains
the complete ascii plot corresponding to the function run_random_walk above.


3.7.8 Random Walk as a Stochastic Equation


The (dimensionless) position in a random walk, $X_k$, can be expressed as a stochastic
difference equation:

\[ X_k = X_{k-1} + s, \quad X_0 = 0, \tag{3.117} \]

where $s$ is a Bernoulli variable¹¹, taking on the two values $s = -1$ and $s = 1$ with
equal probability:

\[ P(s = 1) = \frac{1}{2}, \quad P(s = -1) = \frac{1}{2}. \]

The $s$ variable in a step is independent of the $s$ variable in other steps.

The difference equation expresses essentially the sum of independent Bernoulli
variables. Because of the central limit theorem, $X_k$ will then be normally
distributed with expectation $kE[s]$ and variance $k\mathrm{Var}(s)$. The expectation and variance of a
Bernoulli variable with values $r = 0$ and $r = 1$ are $p$ and $p(1 - p)$, respectively.


10 http://bit.ly/1 UbULeH 
1 https://en.wikipedia.org/wiki/Bernoulli_distribution 
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The variable s = 2r — 1 then has expectation 2E[r] — 1 = 2p — 1 = 0 and variance 
2?Var(r) = 4p(1 — p) = 1. The position X; is normally distributed with zero 
expectation and variance k, as we found in Sect. 3.7.2. 

The central limit theorem tells us that as long as $k$ is not small, the distribution of
$X_k$ remains approximately the same if we replace the Bernoulli variable $s$ by any other
stochastic variable with the same expectation and variance. In particular, we may let $s$ be a
standardized Gaussian variable (zero mean, unit variance).
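
As a rough numerical illustration of this claim, the sketch below simulates many walks with steps of -1 or 1 and with standardized Gaussian steps, and compares the sample mean and variance of the position after k steps. The function name `endpoint_stats`, the sample sizes, and the seed are assumptions made for this example only.

```python
import numpy as np

def endpoint_stats(k, num_walks, kind, seed=1):
    """Return sample mean and variance of X_k over num_walks walks."""
    rng = np.random.default_rng(seed)
    if kind == 'bernoulli':
        # s = 2r - 1 with r in {0, 1}, each with probability 1/2
        s = 2*rng.integers(0, 2, size=(num_walks, k)) - 1
    elif kind == 'gaussian':
        # Standardized Gaussian steps: zero mean, unit variance
        s = rng.standard_normal(size=(num_walks, k))
    else:
        raise ValueError('kind must be "bernoulli" or "gaussian"')
    X_k = s.sum(axis=1)          # X_k = s_1 + ... + s_k, with X_0 = 0
    return X_k.mean(), X_k.var()

if __name__ == '__main__':
    for kind in 'bernoulli', 'gaussian':
        mean, var = endpoint_stats(k=200, num_walks=20000, kind=kind)
        print('%-9s mean=%7.3f  variance=%7.2f (theory: 0 and 200)'
              % (kind, mean, var))
```

Both step distributions should give a mean near zero and a variance near k, up to sampling noise.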

Dividing (3.117) by $\Delta t$ gives

$$\frac{X_k - X_{k-1}}{\Delta t} = \frac{1}{\Delta t}s.$$

In the limit $\Delta t \rightarrow 0$, $s/\Delta t$ approaches a white noise stochastic process. With $X(t)$
as the continuous process in the limit $\Delta t \rightarrow 0$ ($X_k \rightarrow X(t_k)$), we formally get the
stochastic differential equation

$$dX = dW, \tag{3.118}$$

where $W(t)$ is a Wiener process (see footnote 12). Then $X$ is also a Wiener process. It follows from
the stochastic ODE $dX = dW$ that the probability distribution of $X$ is given by the
Fokker-Planck equation (3.114) (see footnote 13). In other words, the key results for random walk
we found earlier can alternatively be derived via a stochastic ordinary differential
equation and its related Fokker-Planck equation.


3.7.9 Random Walk in 2D 


The most obvious generalization of 1D random walk to two spatial dimensions is
to allow movements to the north, east, south, and west, with equal probability 1/4:


import random
import numpy as np

def random_walk2D(x0, N, p, random=random):
    """2D random walk with 1 particle and N moves: N, E, W, S."""
    # Store position in step k in position[k]
    d = len(x0)
    position = np.zeros((N+1, d))
    position[0,:] = x0
    current_pos = np.array(x0, dtype=float)
    for k in range(N):
        r = random.uniform(0, 1)
        if r <= 0.25:
            current_pos += np.array([0, 1])   # Move north
        elif 0.25 < r <= 0.5:
            current_pos += np.array([1, 0])   # Move east
        elif 0.5 < r <= 0.75:
            current_pos += np.array([0, -1])  # Move south
        else:
            current_pos += np.array([-1, 0])  # Move west
        position[k+1,:] = current_pos
    return position


12 https://en.wikipedia.org/wiki/Wiener_process 
13 https://en.wikipedia.org/wiki/Fokker-Planck_equation


Fig. 3.21 Random walks in 2D with 200 steps: rectangular mesh (left) and diagonal mesh (right)


The left plot in Fig. 3.21 provides an example of 200 steps with this kind of walk.
We may refer to this walk as a walk on a rectangular mesh as we move from any
spatial mesh point $(i, j)$ to one of its four neighbors in the rectangular directions:
$(i+1, j)$, $(i-1, j)$, $(i, j+1)$, or $(i, j-1)$.
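
The rectangular-mesh walk can also be written without a Python loop. The following is a minimal sketch, not the book's implementation: it draws one direction index per step, looks up the corresponding move in a small table, and accumulates the moves with np.cumsum. The name `random_walk2D_vec` and the `seed` argument are assumptions introduced for this illustration.

```python
import numpy as np

def random_walk2D_vec(x0, N, seed=None):
    """Vectorized rectangular-mesh walk: N, E, S, W with probability 1/4."""
    rng = np.random.default_rng(seed)
    moves = np.array([[0, 1], [1, 0], [0, -1], [-1, 0]])  # N, E, S, W
    choice = rng.integers(0, 4, size=N)    # one direction index per step
    steps = moves[choice]                  # array of shape (N, 2)
    position = np.zeros((N+1, 2))
    position[0] = x0
    position[1:] = np.asarray(x0, dtype=float) + np.cumsum(steps, axis=0)
    return position
```

Each row of the returned array differs from the previous one by exactly one unit move along one of the two axes.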


3.7.10 Random Walk in Any Number of Space Dimensions 


From a programming point of view, especially when implementing a random walk
in any number of dimensions, it is more natural to consider a walk in the diagonal
directions NW, NE, SW, and SE. On a two-dimensional spatial mesh it means that
we go from $(i, j)$ to either $(i+1, j+1)$, $(i-1, j+1)$, $(i+1, j-1)$, or $(i-1, j-1)$.
We can with such a diagonal mesh (see right plot in Fig. 3.21) draw a Bernoulli
variable for the step in each spatial direction and trivially write code that works in
any number of spatial directions:


def random_walkdD(x0, N, p, random=random):
    """Any-D (diagonal) random walk with 1 particle and N moves."""
    # Store position in step k in position[k]
    d = len(x0)
    position = np.zeros((N+1, d))
    position[0,:] = x0
    current_pos = np.array(x0, dtype=float)
    for k in range(N):
        for i in range(d):
            r = random.uniform(0, 1)
            if r <= p:
                current_pos[i] -= 1
            else:
                current_pos[i] += 1
        position[k+1,:] = current_pos
    return position


Fig. 3.22 Four random walks with 5000 steps in 2D


A vectorized version is desired. We follow the ideas from Sect. 3.7.3, but each
step is now a vector in d spatial dimensions. We therefore need to draw Nd random
numbers in r, compute steps in the various directions through np.where(r <= p,
-1, 1) (each step being -1 or 1), and then we can reshape this array to an N x d
array of step vectors. Doing an np.cumsum summation along axis 0 will add the
vectors, as this demo shows:


>>> a = np.arange(6).reshape(3,2)
>>> a
array([[0, 1],
       [2, 3],
       [4, 5]])
>>> np.cumsum(a, axis=0)
array([[0, 1],
       [2, 4],
       [6, 9]])


With such summation of step vectors, we get all the positions to be filled in the 
position array: 


def random_walkdD_vec(x0, N, p):
    """Vectorized version of random_walkdD."""
    d = len(x0)
    # Store position in step k in position[k]
    position = np.zeros((N+1,d))
    position[0] = np.array(x0, dtype=float)
    r = np.random.uniform(0, 1, size=N*d)
    steps = np.where(r <= p, -1, 1).reshape(N,d)
    position[1:,:] = x0 + np.cumsum(steps, axis=0)
    return position


3.7.11 Multiple Random Walks in Any Number of Space Dimensions


As we did in 1D, we extend one single walk to a number of walks (num_walks in 
the code). 


Scalar code As always, we start with implementing the scalar case: 


def random_walksdD(x0, N, p, num_walks=1, num_times=1,
                   random=random):
    """Simulate num_walks random walks from x0 with N steps."""
    d = len(x0)
    position  = np.zeros((N+1, d))  # Accumulated positions
    position2 = np.zeros((N+1, d))  # Accumulated positions**2
    # Histogram at num_times selected time points
    pos_hist = np.zeros((num_walks, num_times, d))
    pos_hist_times = [(N//num_times)*i for i in range(num_times)]

    for n in range(num_walks):
        num_times_counter = 0
        current_pos = np.array(x0, dtype=float)
        for k in range(N):
            if k in pos_hist_times:
                pos_hist[n,num_times_counter,:] = current_pos
                num_times_counter += 1
            # current_pos corresponds to step k+1
            for i in range(d):
                r = random.uniform(0, 1)
                if r <= p:
                    current_pos[i] -= 1
                else:
                    current_pos[i] += 1
            position [k+1,:] += current_pos
            position2[k+1,:] += current_pos**2
    return position, position2, pos_hist, np.array(pos_hist_times)
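
The accumulated sums of positions and squared positions are meant to be turned into estimates of the expectation and variance at each step. The sketch below shows one way to do this post-processing. The helper `walk_sums` is a hypothetical, compact stand-in for the simulation (it is not the function above), and `mean_and_variance` is likewise a name chosen for this example. For steps of -1 with probability p and 1 otherwise, the theoretical values per dimension are k(1 - 2p) for the mean and 4p(1 - p)k for the variance when the walk starts at the origin.

```python
import numpy as np

def walk_sums(x0, N, p, num_walks, seed=0):
    """Hypothetical helper: sums of positions and positions**2 over walks."""
    rng = np.random.default_rng(seed)
    d = len(x0)
    r = rng.uniform(0, 1, size=(num_walks, N, d))
    steps = np.where(r <= p, -1, 1)
    walks = np.zeros((num_walks, N+1, d))
    walks[:, 0, :] = x0
    walks[:, 1:, :] = np.asarray(x0) + np.cumsum(steps, axis=1)
    return walks.sum(axis=0), (walks**2).sum(axis=0)

def mean_and_variance(position, position2, num_walks):
    """Estimate E[X_k] and Var(X_k) from the accumulated sums."""
    mean = position/num_walks
    var = position2/num_walks - mean**2
    return mean, var

if __name__ == '__main__':
    N, p, num_walks = 100, 0.3, 5000
    pos, pos2 = walk_sums([0, 0], N, p, num_walks)
    mean, var = mean_and_variance(pos, pos2, num_walks)
    print('mean at step N:', mean[-1], 'theory:', N*(1 - 2*p))
    print('variance at step N:', var[-1], 'theory:', 4*p*(1 - p)*N)
```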


Vectorized code Significant speed-ups can be obtained by vectorization. We get 
rid of the loops in the previous function and arrive at the following vectorized code. 


def random_walksdD_vec(x0, N, p, num_walks=1, num_times=1):
    """Vectorized version of random_walksdD; no loops."""
    d = len(x0)
    position  = np.zeros((N+1, d))  # Accumulated positions
    position2 = np.zeros((N+1, d))  # Accumulated positions**2
    walks = np.zeros((num_walks, N+1, d))  # Positions of each walk
    walks[:,0,:] = x0
    # Histogram at num_times selected time points
    pos_hist = np.zeros((num_walks, num_times, d))
    pos_hist_times = [(N//num_times)*i for i in range(num_times)]

    r = np.random.uniform(0, 1, size=N*num_walks*d)
    steps = np.where(r <= p, -1, 1).reshape(num_walks, N, d)
    walks[:,1:,:] = x0 + np.cumsum(steps, axis=1)

    position = np.sum(walks, axis=0)
    position2 = np.sum(walks**2, axis=0)
    pos_hist[:,:,:] = walks[:,pos_hist_times,:]

    return position, position2, pos_hist, np.array(pos_hist_times)


3.8 Applications
3.8.1 Diffusion of a Substance 


The first process to be considered is a substance that gets transported through a
fluid at rest by pure diffusion. We consider an arbitrary volume $V$ of this fluid,
containing the substance with concentration function $c(\boldsymbol{x},t)$. Physically, we can
think of a very small volume with centroid $\boldsymbol{x}$ at time $t$ and assign the ratio of the
volume of the substance and the total volume to $c(\boldsymbol{x},t)$. This means that the mass
of the substance in a small volume $\Delta V$ is approximately $\varrho c\Delta V$, where $\varrho$ is the
density of the substance. Consequently, the total mass of the substance inside the
volume $V$ is the sum of all $\varrho c\Delta V$, which becomes the volume integral $\int_V \varrho c\,dV$.

Let us reason how the mass of the substance changes and thereby derive a PDE
governing the concentration $c$. Suppose the substance flows out of $V$ with a flux $\boldsymbol{q}$.
If $\Delta S$ is a small part of the boundary $\partial V$ of $V$, the mass of the substance flowing
out through $\Delta S$ in a small time interval $\Delta t$ is $\varrho\boldsymbol{q}\cdot\boldsymbol{n}\Delta t\Delta S$, where $\boldsymbol{n}$ is an outward
unit normal to the boundary $\partial V$, see Fig. 3.23. We realize that only the normal
component of $\boldsymbol{q}$ is able to transport mass in and out of $V$. The total outflow of the
mass of the substance in a small time interval $\Delta t$ becomes the surface integral

$$\int_{\partial V}\varrho\boldsymbol{q}\cdot\boldsymbol{n}\,\Delta t\,dS.$$

Assuming conservation of mass, this outflow of mass must be balanced by a loss
of mass inside the volume. The increase of mass inside the volume, during a small
time interval $\Delta t$, is

$$\int_V \varrho\left(c(\boldsymbol{x},t+\Delta t) - c(\boldsymbol{x},t)\right)dV,$$

assuming $\varrho$ is constant, which is reasonable. The outflow of mass balances the loss
of mass in $V$, which is the increase with a minus sign. Setting the two contributions
equal to each other ensures balance of mass inside $V$. Dividing by $\Delta t$ gives

$$\int_V \varrho\frac{c(\boldsymbol{x},t+\Delta t) - c(\boldsymbol{x},t)}{\Delta t}dV = -\int_{\partial V}\varrho\boldsymbol{q}\cdot\boldsymbol{n}\,dS.$$

Note the minus sign on the right-hand side: the left-hand side expresses the gain of
mass inside $V$, while the integral on the right-hand side is the outflow (loss) of mass.


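Fig. 3.23 An arbitrary volume of a fluid


Now, letting $\Delta t \rightarrow 0$, we have

$$\frac{c(\boldsymbol{x},t+\Delta t) - c(\boldsymbol{x},t)}{\Delta t} \rightarrow \frac{\partial c}{\partial t},$$

so

$$\int_V \varrho\frac{\partial c}{\partial t}dV + \int_{\partial V}\varrho\boldsymbol{q}\cdot\boldsymbol{n}\,dS = 0. \tag{3.119}$$

To arrive at a PDE, we express the surface integral as a volume integral using Gauss'
divergence theorem:

$$\int_V\left(\varrho\frac{\partial c}{\partial t} + \nabla\cdot(\varrho\boldsymbol{q})\right)dV = 0.$$

Since $\varrho$ is constant, we can divide by this quantity. If the integral is to vanish for an
arbitrary volume $V$, the integrand must vanish too, and we get the mass conservation
PDE for the substance:

$$\frac{\partial c}{\partial t} + \nabla\cdot\boldsymbol{q} = 0. \tag{3.120}$$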
ot 

A fundamental problem is that this is a scalar PDE for four unknowns: $c$ and
the three components of $\boldsymbol{q}$. We therefore need additional equations. Here, Fick's
law comes to the rescue: it models how the flux $\boldsymbol{q}$ of the substance is related to the
concentration $c$. Diffusion is recognized by mass flowing from regions with high
concentration to regions of low concentration. This principle suggests that $\boldsymbol{q}$ is
proportional to the negative gradient of $c$:

$$\boldsymbol{q} = -\alpha\nabla c, \tag{3.121}$$

where $\alpha$ is an empirically determined constant. The relation (3.121) is known as
Fick's law. Inserting (3.121) in (3.120) gives a scalar PDE for the concentration $c$:

$$\frac{\partial c}{\partial t} = \alpha\nabla^2 c. \tag{3.122}$$
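
To connect the derivation to a computation, here is a minimal 1D sketch (an illustration under stated assumptions, not a scheme prescribed by the text). It treats the entries of c as cell averages, computes the Fickian flux on the cell faces, and updates c from the discrete mass balance with a Forward Euler step. Zero flux at both ends models a closed container, so the total amount of substance should stay constant. The function name `diffuse_1d` and all parameter values are chosen for this example.

```python
import numpy as np

def diffuse_1d(c0, alpha, dx, dt, num_steps):
    """Forward Euler for c_t = alpha*c_xx with zero flux at both ends."""
    F = alpha*dt/dx**2
    if F > 0.5:
        raise ValueError('Forward Euler needs alpha*dt/dx**2 <= 0.5')
    c = np.array(c0, dtype=float)
    for _ in range(num_steps):
        # Fick's law on the cell faces: q = -alpha*dc/dx (q = 0 at the walls)
        q = np.zeros(len(c) + 1)
        q[1:-1] = -alpha*(c[1:] - c[:-1])/dx
        # Mass conservation: dc/dt = -dq/dx
        c -= dt*(q[1:] - q[:-1])/dx
    return c

if __name__ == '__main__':
    x = np.linspace(0, 1, 51)
    c0 = np.where(x < 0.5, 1.0, 0.0)      # substance in the left half only
    c = diffuse_1d(c0, alpha=1.0, dx=x[1]-x[0], dt=1e-4, num_steps=500)
    print('total amount before and after:', c0.sum(), c.sum())
```

Because every interior face flux is added to one cell and subtracted from its neighbor, the sum of c is preserved up to round-off, which mirrors the integral mass balance (3.119).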


3.8.2 Heat Conduction 


Heat conduction is a well-known diffusion process. The governing PDE is in this 
case based on the first law of thermodynamics: the increase in energy of a system 
is equal to the work done on the system, plus the supplied heat. Here, we shall 
consider media at rest and neglect work done on the system. The principle then 
reduces to a balance between increase in internal energy and supplied heat flow by 
conduction. 

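Let $e(\boldsymbol{x},t)$ be the internal energy per unit mass. The increase of the internal
energy in a small volume $\Delta V$ in a small time interval $\Delta t$ is then

$$\varrho\left(e(\boldsymbol{x},t+\Delta t) - e(\boldsymbol{x},t)\right)\Delta V,$$

where $\varrho$ is the density of the material subject to heat conduction. In an arbitrary
volume $V$, as depicted in Fig. 3.23, the corresponding increase in internal energy
becomes the volume integral

$$\int_V \varrho\left(e(\boldsymbol{x},t+\Delta t) - e(\boldsymbol{x},t)\right)dV.$$

This increase in internal energy is balanced by heat supplied by conduction. Let $\boldsymbol{q}$
be the heat flow per time unit. Through the surface $\partial V$ of $V$ the following amount
of heat flows out of $V$ during a time interval $\Delta t$:

$$\int_{\partial V}\boldsymbol{q}\cdot\boldsymbol{n}\,\Delta t\,dS.$$

The simplified version of the first law of thermodynamics then states that

$$\int_V \varrho\left(e(\boldsymbol{x},t+\Delta t) - e(\boldsymbol{x},t)\right)dV = -\int_{\partial V}\boldsymbol{q}\cdot\boldsymbol{n}\,\Delta t\,dS.$$

The minus sign on the right-hand side ensures that the integral there models net
inflow of heat (since $\boldsymbol{n}$ is an outward unit normal, $\boldsymbol{q}\cdot\boldsymbol{n}$ models outflow). Dividing
by $\Delta t$ and noting that

$$\lim_{\Delta t\rightarrow 0}\frac{e(\boldsymbol{x},t+\Delta t) - e(\boldsymbol{x},t)}{\Delta t} = \frac{\partial e}{\partial t},$$

we get (in the limit $\Delta t\rightarrow 0$)

$$\int_V \varrho\frac{\partial e}{\partial t}dV + \int_{\partial V}\boldsymbol{q}\cdot\boldsymbol{n}\,dS = 0.$$

This is the integral equation for heat conduction, but we aim at a PDE. The next
step is therefore to transform the surface integral to a volume integral via Gauss'
divergence theorem. The result is

$$\int_V\left(\varrho\frac{\partial e}{\partial t} + \nabla\cdot\boldsymbol{q}\right)dV = 0.$$

If this equality is to hold for all volumes $V$, the integrand must vanish, and we have
the PDE

$$\varrho\frac{\partial e}{\partial t} = -\nabla\cdot\boldsymbol{q}. \tag{3.123}$$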


Sometimes the supplied heat can come from the medium itself. This is the case,
for instance, when radioactive rock generates heat. Let us add this effect. If $f(\boldsymbol{x},t)$
is the supplied heat per unit volume per unit time, the heat supplied in a small
volume is $f\Delta t\Delta V$, and inside an arbitrary volume $V$ the supplied generated heat
becomes

$$\int_V f\Delta t\,dV.$$

Adding this to the integral statement of the (simplified) first law of thermodynamics,
and continuing the derivation, leads to the PDE

$$\varrho\frac{\partial e}{\partial t} = -\nabla\cdot\boldsymbol{q} + f. \tag{3.124}$$
There are four unknown scalar fields: $e$ and $\boldsymbol{q}$. Moreover, the temperature $T$,
which is our primary quantity to compute, does not enter the model yet. We need
an additional equation, called the equation of state, relating $e$, $V = 1/\varrho$, and $T$:
$e = e(V,T)$. By the chain rule we have

$$\frac{\partial e}{\partial t} = \left.\frac{\partial e}{\partial T}\right|_V\frac{\partial T}{\partial t} + \left.\frac{\partial e}{\partial V}\right|_T\frac{\partial V}{\partial t}.$$

The first coefficient $\partial e/\partial T$ is called specific heat capacity at constant volume,
denoted by $c_v$:

$$c_v = \left.\frac{\partial e}{\partial T}\right|_V.$$

The specific heat capacity will in general vary with $T$, but taking it as a constant is
a good approximation in many applications.

The term $\partial e/\partial V$ models effects due to compressibility and volume expansion.
These effects are often small and can be neglected. We shall do so here. Using
$\partial e/\partial t = c_v\,\partial T/\partial t$ in the PDE gives

$$\varrho c_v\frac{\partial T}{\partial t} = -\nabla\cdot\boldsymbol{q} + f.$$


We still have four unknown scalar fields ($T$ and $\boldsymbol{q}$). To close the system, we need a
relation between the heat flux $\boldsymbol{q}$ and the temperature $T$ called Fourier's law:

$$\boldsymbol{q} = -k\nabla T,$$

which simply states that heat flows from hot to cold areas, along the path of greatest
variation. In a solid medium, $k$ depends on the material of the medium, and in multi-material
media one must regard $k$ as spatially dependent. In a fluid, it is common
to assume that $k$ is constant. The value of $k$ reflects how easily heat is conducted
through the medium, and $k$ is named the coefficient of heat conduction.

We now have one scalar PDE for the unknown temperature field $T(\boldsymbol{x},t)$:

$$\varrho c_v\frac{\partial T}{\partial t} = \nabla\cdot(k\nabla T) + f. \tag{3.125}$$


3.8.3 Porous Media Flow


The requirement of mass balance for flow of a single, incompressible fluid through
a deformable (elastic) porous medium leads to the equation

$$S\frac{\partial p}{\partial t} + \nabla\cdot\left(\boldsymbol{q} - \alpha\frac{\partial\boldsymbol{u}}{\partial t}\right) = 0,$$

where $p$ is the fluid pressure, $\boldsymbol{q}$ is the fluid velocity, $\boldsymbol{u}$ is the displacement
(deformation) of the medium, $S$ is the storage coefficient of the medium (related to the
compressibility of the fluid and the material in the medium), and $\alpha$ is another
coefficient. In many circumstances, the last term with $\boldsymbol{u}$ can be neglected, an assumption
that decouples the equation above from a model for the deformation of the medium.
The famous Darcy's law relates $\boldsymbol{q}$ to $p$:

$$\boldsymbol{q} = -\frac{K}{\mu}(\nabla p - \varrho\boldsymbol{g}),$$

where $K$ is the permeability of the medium, $\mu$ is the dynamic viscosity of the fluid,
$\varrho$ is the density of the fluid, and $\boldsymbol{g}$ is the acceleration of gravity, here taken as
$\boldsymbol{g} = -g\boldsymbol{k}$. Combining the two equations results in the diffusion model

$$S\frac{\partial p}{\partial t} = \mu^{-1}\nabla\cdot(K\nabla p) + \frac{\varrho g}{\mu}\frac{\partial K}{\partial z}. \tag{3.126}$$

Boundary conditions consist of specifying $p$ or $\boldsymbol{q}\cdot\boldsymbol{n}$ (i.e., normal velocity) at each
point of the boundary.


3.8.4 Potential Fluid Flow 


Let $\boldsymbol{v}$ be the velocity of a fluid. The condition $\nabla\times\boldsymbol{v} = 0$ is relevant for many flows,
especially in geophysics when viscous effects are negligible. From vector calculus
it is known that $\nabla\times\boldsymbol{v} = 0$ implies that $\boldsymbol{v}$ can be derived from a scalar potential field
$\phi$: $\boldsymbol{v} = \nabla\phi$. If the fluid is incompressible, $\nabla\cdot\boldsymbol{v} = 0$, it follows that
$\nabla\cdot\nabla\phi = 0$, or

$$\nabla^2\phi = 0. \tag{3.127}$$

This Laplace equation is sufficient for determining $\phi$ and thereby describing the fluid
motion. This type of flow is known as potential flow (see footnote 14). One very important
application where potential flow is a good model is water waves. As boundary condition
we must prescribe $\boldsymbol{v}\cdot\boldsymbol{n} = \partial\phi/\partial n$. This gives rise to what is known as a pure
Neumann problem and will cause numerical difficulties because $\phi$ and $\phi$ plus any
constant are two solutions of the problem. The simplest remedy is to fix the value
of $\phi$ at a point.
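
The non-uniqueness of a pure Neumann problem shows up directly in the discrete equations. As a small illustration (a 1D sketch with unit mesh spacing, not taken from the text), the finite difference Laplacian with zero-flux ends is a singular matrix whose null space contains the constant vectors. Replacing one equation by a fixed value at one point restores full rank. The function name `neumann_laplacian` is an assumption for this example.

```python
import numpy as np

def neumann_laplacian(n):
    """1D finite difference Laplacian (unit spacing) with zero-flux ends."""
    A = np.zeros((n, n))
    for i in range(n):
        if i > 0:
            A[i, i-1] = 1
            A[i, i] -= 1
        if i < n-1:
            A[i, i+1] = 1
            A[i, i] -= 1
    return A

n = 6
A = neumann_laplacian(n)
# Constants lie in the null space, so A @ (phi + C) equals A @ phi
print(np.linalg.matrix_rank(A))         # rank n - 1: the matrix is singular

# Remedy: replace one equation by phi[0] = 0 (fix the value at a point)
A_fixed = A.copy()
A_fixed[0, :] = 0
A_fixed[0, 0] = 1
print(np.linalg.matrix_rank(A_fixed))   # rank n: a unique solution exists
```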


14 https://en.wikipedia.org/wiki/Potential_flow


3.8.5 Streamlines for 2D Fluid Flow


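The streamlines in a two-dimensional stationary fluid flow are lines tangential to
the flow. The stream function (see footnote 15) $\psi$ is often introduced in two-dimensional
flow such that its contour lines, $\psi = \text{const}$, give the streamlines. The relation between $\psi$
and the velocity field $\boldsymbol{v} = (u, v)$ is

$$u = \frac{\partial\psi}{\partial y}, \quad v = -\frac{\partial\psi}{\partial x}.$$

It follows that $\nabla\cdot\boldsymbol{v} = \psi_{yx} - \psi_{xy} = 0$, so the stream function can only be used for
incompressible flows. Since

$$\nabla\times\boldsymbol{v} = \left(\frac{\partial v}{\partial x} - \frac{\partial u}{\partial y}\right)\boldsymbol{k} \equiv \omega\boldsymbol{k},$$

we can derive the relation

$$\nabla^2\psi = -\omega, \tag{3.128}$$

which is a governing equation for the stream function $\psi(x, y)$ if the vorticity $\omega$ is
known.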
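
A quick numerical check of these relations can be made with finite differences. The sketch below assumes the sign convention u = ∂ψ/∂y, v = -∂ψ/∂x and picks an arbitrary stream function ψ = sin(x) sin(y) for illustration. It uses np.gradient for the derivatives and verifies that the resulting velocity field is divergence free and that the vorticity equals minus the Laplacian of ψ (which is 2ψ for this choice).

```python
import numpy as np

x = np.linspace(0, np.pi, 41)
y = np.linspace(0, np.pi, 41)
X, Y = np.meshgrid(x, y, indexing='ij')
psi = np.sin(X)*np.sin(Y)               # hypothetical stream function

# Velocity from the stream function: u = d(psi)/dy, v = -d(psi)/dx
u = np.gradient(psi, y, axis=1)
v = -np.gradient(psi, x, axis=0)

# Divergence and vorticity of the computed velocity field
div = np.gradient(u, x, axis=0) + np.gradient(v, y, axis=1)
omega = np.gradient(v, x, axis=0) - np.gradient(u, y, axis=1)

print(abs(div).max())                    # zero up to round-off
# Here the Laplacian of psi is -2*psi, so omega should be close to 2*psi
# (points next to the boundary are skipped: one-sided differences there)
print(abs(omega - 2*psi)[2:-2, 2:-2].max())
```

The divergence vanishes to round-off because the difference operators in x and y commute, just as the mixed partial derivatives do in the continuous case.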


3.8.6 The Potential of an Electric Field 


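Under the assumption of time independence, Maxwell's equations for the electric
field $\boldsymbol{E}$ become

$$\nabla\cdot\boldsymbol{E} = \frac{\rho}{\epsilon_0},$$
$$\nabla\times\boldsymbol{E} = 0,$$

where $\rho$ is the electric charge density and $\epsilon_0$ is the electric permittivity of free space
(i.e., vacuum). Since $\nabla\times\boldsymbol{E} = 0$, $\boldsymbol{E}$ can be derived from a potential $\varphi$,
$\boldsymbol{E} = -\nabla\varphi$. The electric field potential is therefore governed by the Poisson equation

$$\nabla^2\varphi = -\frac{\rho}{\epsilon_0}. \tag{3.129}$$

If the medium is heterogeneous, $\rho$ will depend on the spatial location $\boldsymbol{r}$. Also,
$\epsilon_0$ must be exchanged with an electric permittivity function $\epsilon(\boldsymbol{r})$.

Each point of the boundary must be accompanied by either a Dirichlet condition
$\varphi(\boldsymbol{r}) = \varphi_D(\boldsymbol{r})$, or a Neumann condition
$\frac{\partial\varphi(\boldsymbol{r})}{\partial n} = \varphi_N(\boldsymbol{r})$.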
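
As an illustration of how such a Poisson problem may be discretized, the sketch below solves the 1D version φ'' = -ρ/ε₀ on [0, L] with Dirichlet conditions, using centered differences and a dense linear solve. The function name `solve_poisson_1d`, the dense matrix, and the scaled values ρ = ε₀ = 1 in the usage example are assumptions made for brevity, not part of the text.

```python
import numpy as np

def solve_poisson_1d(rho, eps0, L, phi_left, phi_right):
    """Solve phi'' = -rho/eps0 on [0, L] with Dirichlet conditions.

    rho holds the charge density at the n interior mesh points.
    """
    n = len(rho)
    dx = L/(n + 1)
    # Centered difference approximation of the second derivative
    A = (np.diag(-2.0*np.ones(n)) + np.diag(np.ones(n-1), 1)
         + np.diag(np.ones(n-1), -1))/dx**2
    b = -np.asarray(rho, dtype=float)/eps0
    b[0] -= phi_left/dx**2       # move known boundary values to the
    b[-1] -= phi_right/dx**2     # right-hand side
    return np.linalg.solve(A, b)

if __name__ == '__main__':
    n = 9
    x = np.linspace(0, 1, n + 2)[1:-1]          # interior mesh points
    phi = solve_poisson_1d(np.ones(n), 1.0, 1.0, 0.0, 0.0)
    # For constant rho the exact solution is the parabola x*(1 - x)/2
    print(abs(phi - 0.5*x*(1 - x)).max())
```

For a constant charge density the exact potential is quadratic, and the centered difference is exact for quadratics, so the printed error is at round-off level.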


3.8.7 Development of Flow Between Two Flat Plates 


Diffusion equations may also arise as simplified versions of other mathematical
models, especially in fluid flow. Consider a fluid flowing between two flat, parallel
plates. The velocity is uni-directional, say along the $z$ axis, and depends only on
the distance $x$ from the plates; $\boldsymbol{u} = u(x,t)\boldsymbol{k}$. The flow is governed by the
Navier-Stokes equations,

$$\varrho\frac{\partial\boldsymbol{u}}{\partial t} + \varrho\boldsymbol{u}\cdot\nabla\boldsymbol{u} = -\nabla p + \mu\nabla^2\boldsymbol{u} + \varrho\boldsymbol{f},$$
$$\nabla\cdot\boldsymbol{u} = 0,$$

where $p$ is the pressure field, unknown along with the velocity $\boldsymbol{u}$, $\varrho$ is the fluid
density, $\mu$ the dynamic viscosity, and $\boldsymbol{f}$ is some external body force. The geometric
restrictions of flow between two flat plates put restrictions on the velocity,
$\boldsymbol{u} = u(x,t)\boldsymbol{k}$, and the $z$ component of the Navier-Stokes equations collapses to a
diffusion equation:

$$\varrho\frac{\partial u}{\partial t} = -\frac{\partial p}{\partial z} + \mu\frac{\partial^2 u}{\partial x^2} + \varrho f_z,$$

if $f_z$ is the component of $\boldsymbol{f}$ in the $z$ direction.


15 https://en.wikipedia.org/wiki/Stream_function

The boundary conditions are derived from the fact that the fluid sticks to the
plates, which means $u = 0$ at the plates. Say the plates are located at $x = 0$
and $x = L$. We then have

$$u(0,t) = u(L,t) = 0.$$

One can easily show that $\partial p/\partial z$ must be a constant or just a function of time
$t$. We set $\partial p/\partial z = -\beta(t)$. The body force could be a component of gravity, if
desired, set as $f_z = \gamma g$. With $x$ as the only independent space variable, we get a very
standard one-dimensional diffusion equation:

$$\varrho\frac{\partial u}{\partial t} = \mu\frac{\partial^2 u}{\partial x^2} + \beta(t) + \varrho\gamma g, \quad x\in[0,L],\ t\in(0,T].$$

The boundary conditions are

$$u(0,t) = u(L,t) = 0,$$

while some initial condition

$$u(x,0) = I(x)$$

must also be prescribed.

The flow is driven by either the pressure gradient $\beta$ or gravity, or a combination
of both. One may also consider one moving plate that drives the fluid. If the plate
at $x = L$ moves with velocity $U_L(t)$, we have the adjusted boundary condition

$$u(L,t) = U_L(t).$$


3.8.8 Flow in a Straight Tube 


Now we consider viscous fluid flow in a straight tube with radius $R$ and rigid walls.
The governing equations are the Navier-Stokes equations, but as in Sect. 3.8.7, it
is natural to assume that the velocity is directed along the tube, and that it is
axi-symmetric. These assumptions reduce the velocity field to $\boldsymbol{u} = u(r,x,t)\boldsymbol{i}$, if the
$x$ axis is directed along the tube. From the equation of continuity, $\nabla\cdot\boldsymbol{u} = 0$, we
see that $u$ must be independent of $x$. Inserting $\boldsymbol{u} = u(r,t)\boldsymbol{i}$ in the Navier-Stokes
equations, expressed in axi-symmetric cylindrical coordinates, results in

$$\varrho\frac{\partial u}{\partial t} = \mu\frac{1}{r}\frac{\partial}{\partial r}\left(r\frac{\partial u}{\partial r}\right) + \beta(t) + \varrho\gamma g, \quad r\in[0,R],\ t\in(0,T]. \tag{3.130}$$

Here, $\beta(t) = -\partial p/\partial x$ is the pressure gradient along the tube. The associated
boundary condition is $u(R,t) = 0$.


3.8.9 Tribology: Thin Film Fluid Flow 


Thin fluid films are extremely important inside machinery to reduce friction between
gliding surfaces. The mathematical model for the fluid motion takes the
form of a diffusion problem and is quickly derived here. We consider two solid
surfaces whose distance is described by a gap function $h(x, y)$. The space between
these surfaces is filled with a fluid with dynamic viscosity $\mu$. The fluid may move
partially because of pressure gradients and partially because the surfaces move. Let
$U\boldsymbol{i} + V\boldsymbol{j}$ be the relative velocity of the two surfaces and $p$ the pressure in the
fluid. The mathematical model builds on two principles: 1) conservation of mass,
2) assumption of locally quasi-static flow between flat plates.

The conservation of mass equation reads $\nabla\cdot\boldsymbol{u} = 0$, where $\boldsymbol{u}$ is the local fluid velocity.
For thin films the detailed variation between the surfaces is not of interest, so
$\nabla\cdot\boldsymbol{u} = 0$ is integrated (averaged) in the direction perpendicular to the surfaces. This gives
rise to the alternative mass conservation equation

$$\nabla\cdot\boldsymbol{q} = 0, \quad \boldsymbol{q} = \int_0^{h(x,y)}\boldsymbol{u}\,dz,$$

where $z$ is the coordinate perpendicular to the surfaces, and $\boldsymbol{q}$ is then the volume
flux in the fluid gap.

Locally, we may assume that we have steady flow between two flat surfaces, with
a pressure gradient and where the lower surface is at rest and the upper moves with
velocity $U\boldsymbol{i} + V\boldsymbol{j}$. The corresponding mathematical problem is actually the limit
problem in Sect. 3.8.7 as $t\rightarrow\infty$. The limit problem can be solved analytically, and
the local volume flux becomes

$$\boldsymbol{q}(x,y) = \int_0^h\boldsymbol{u}(x,y,z)\,dz = -\frac{h^3}{12\mu}\nabla p + \frac{1}{2}Uh\boldsymbol{i} + \frac{1}{2}Vh\boldsymbol{j}.$$

The idea is to use this expression locally also when the surfaces are not flat, but
slowly varying, and if $U$, $V$, or $p$ varies in time, provided the time variation is
sufficiently slow. This is a common quasi-static approximation, much used in
mathematical modeling.

Inserting the expression for $\boldsymbol{q}$ via $p$, $U$, and $V$ in the equation $\nabla\cdot\boldsymbol{q} = 0$ gives a
diffusion PDE for $p$:

$$\nabla\cdot\left(\frac{h^3}{12\mu}\nabla p\right) = \frac{1}{2}\frac{\partial}{\partial x}(hU) + \frac{1}{2}\frac{\partial}{\partial y}(hV). \tag{3.131}$$

The boundary conditions must involve $p$ or $\boldsymbol{q}$ at the boundary.


3.8.10 Propagation of Electrical Signals in the Brain 


One can make a model of how electrical signals are propagated along the neuronal
fibers that receive synaptic inputs in the brain. The signal propagation is
one-dimensional and can, in the simplest cases, be governed by the Cable equation
(see footnote 16):

$$c_m\frac{\partial V}{\partial t} = \frac{1}{r_l}\frac{\partial^2 V}{\partial x^2} - \frac{V}{r_m}, \tag{3.132}$$

where $V(x,t)$ is the voltage to be determined, $c_m$ is capacitance of the neuronal
fiber, while $r_l$ and $r_m$ are measures of the resistance. The boundary conditions are
often taken as $V = 0$ at a short circuit or open end, $\partial V/\partial x = 0$ at a sealed end, or
$\partial V/\partial x \propto V$ where there is an injection of current.


3.9 Exercises 


Exercise 3.6: Stabilizing the Crank-Nicolson method by Rannacher time
stepping

It is well known that the Crank-Nicolson method may give rise to non-physical
oscillations in the solution of diffusion equations if the initial data exhibit jumps
(see Sect. 3.3.6). Rannacher [15] suggested a stabilizing technique consisting of
using the Backward Euler scheme for the first two time steps with step length $\frac{1}{2}\Delta t$.
One can generalize this idea to taking $2m$ time steps of size $\frac{1}{2}\Delta t$ with the Backward
Euler method and then continuing with the Crank-Nicolson method, which is of
second-order in time. The idea is that the high frequencies of the initial solution are
quickly damped out, and the Backward Euler scheme treats these high frequencies
correctly. Thereafter, the high frequency content of the solution is gone and the
Crank-Nicolson method will do well.

Test this idea for $m = 1, 2, 3$ on a diffusion problem with a discontinuous initial
condition. Measure the convergence rate using the solution (3.45) with the boundary
conditions (3.46)-(3.47) for $t$ values such that the conditions are in the vicinity
of $\pm 1$. For example, $t < 5a\,1.6\cdot 10^{-2}$ makes the solution diffuse from a step to
almost a straight line. The program diffu_erf_sol.py shows how to compute
the analytical solution.


Project 3.7: Energy estimates for diffusion problems
This project concerns so-called energy estimates for diffusion problems that can be
used for qualitative analytical insight and for verification of implementations.

a) We start with a 1D homogeneous diffusion equation with zero Dirichlet conditions:

$$u_t = \alpha u_{xx}, \quad x\in\Omega = (0,L),\ t\in(0,T], \tag{3.133}$$
$$u(0,t) = u(L,t) = 0, \quad t\in(0,T], \tag{3.134}$$
$$u(x,0) = I(x), \quad x\in[0,L]. \tag{3.135}$$

The energy estimate for this problem reads

$$||u||_{L^2} \leq ||I||_{L^2}, \tag{3.136}$$

where the $||\cdot||_{L^2}$ norm is defined by

$$||g||_{L^2} = \sqrt{\int_0^L g^2\,dx}. \tag{3.137}$$

The quantity $||u||_{L^2}$ or $\frac{1}{2}||u||_{L^2}$ is known as the energy of the solution, although
it is not the physical energy of the system. A mathematical tradition has introduced
the notion energy in this context.

The estimate (3.136) says that the "size of $u$" never exceeds that of the initial
condition, or more precisely, it says that the area under the $u^2$ curve does not
increase with time.

To show (3.136), multiply the PDE by $u$ and integrate from 0 to $L$. Use that $uu_t$
can be expressed as the time derivative of $u^2$ and that $u_{xx}u$ can be integrated by
parts to form an integrand $u_x^2$. Show that the time derivative of $||u||_{L^2}^2$ must be
less than or equal to zero. Integrate this expression and derive (3.136).

b) Now we address a slightly different problem,

$$u_t = \alpha u_{xx} + f(x,t), \quad x\in\Omega = (0,L),\ t\in(0,T], \tag{3.138}$$
$$u(0,t) = u(L,t) = 0, \quad t\in(0,T], \tag{3.139}$$
$$u(x,0) = 0, \quad x\in[0,L]. \tag{3.140}$$

The associated energy estimate is

$$||u||_{L^2} \leq ||f||_{L^2}. \tag{3.141}$$

(This result is more difficult to derive.)

Now consider the compound problem with an initial condition $I(x)$ and a right-hand
side $f(x,t)$:

$$u_t = \alpha u_{xx} + f(x,t), \quad x\in\Omega = (0,L),\ t\in(0,T], \tag{3.142}$$
$$u(0,t) = u(L,t) = 0, \quad t\in(0,T], \tag{3.143}$$
$$u(x,0) = I(x), \quad x\in[0,L]. \tag{3.144}$$

Show that if $w_1$ fulfills (3.133)-(3.135) and $w_2$ fulfills (3.138)-(3.140), then
$u = w_1 + w_2$ is the solution of (3.142)-(3.144). Using the triangle inequality
for norms,

$$||a + b|| \leq ||a|| + ||b||,$$

show that the energy estimate for (3.142)-(3.144) becomes

$$||u||_{L^2} \leq ||I||_{L^2} + ||f||_{L^2}. \tag{3.145}$$

c) One application of (3.145) is to prove uniqueness of the solution. Suppose $u_1$
and $u_2$ both fulfill (3.142)-(3.144). Show that $u = u_1 - u_2$ then fulfills (3.142)-(3.144)
with $f = 0$ and $I = 0$. Use (3.145) to deduce that the energy must be
zero for all times and therefore that $u_1 = u_2$, which proves that the solution is
unique.


16 http://en.wikipedia.org/wiki/Cable_equation

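d) Generalize (3.145) to a 2D/3D diffusion equation $u_t = \nabla\cdot(\alpha\nabla u)$ for $x\in\Omega$.

Hint Use integration by parts in multi dimensions:

$$\int_\Omega u\nabla\cdot(\alpha\nabla u)\,dx = -\int_\Omega\alpha\nabla u\cdot\nabla u\,dx + \int_{\partial\Omega}u\alpha\frac{\partial u}{\partial n}\,ds,$$

where $\frac{\partial u}{\partial n} = \boldsymbol{n}\cdot\nabla u$, $\boldsymbol{n}$ being the outward unit normal to the boundary
$\partial\Omega$ of the domain $\Omega$.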


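e) Now we also consider the multi-dimensional PDE $u_t = \nabla\cdot(\alpha\nabla u)$. Integrate
both sides over $\Omega$ and use Gauss' divergence theorem,
$\int_\Omega\nabla\cdot\boldsymbol{q}\,dx = \int_{\partial\Omega}\boldsymbol{q}\cdot\boldsymbol{n}\,ds$
for a vector field $\boldsymbol{q}$. Show that if we have homogeneous Neumann conditions
on the boundary, $\partial u/\partial n = 0$, the area under the $u$ surface remains constant in time
and

$$\int_\Omega u\,dx = \int_\Omega I\,dx. \tag{3.146}$$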


f) Establish a code in 1D, 2D, or 3D that can solve a diffusion equation with a
source term $f$, initial condition $I$, and zero Dirichlet or Neumann conditions on
the whole boundary.
We can use (3.145) and (3.146) as a partial verification of the code. Choose
some functions $f$ and $I$ and check that (3.145) is obeyed at any time when zero
Dirichlet conditions are used. Iterate over the same $I$ functions and check that
(3.146) is fulfilled when using zero Neumann conditions.

g) Make a list of some possible bugs in the code, such as indexing errors in ar- 
rays, failure to set the correct boundary conditions, evaluation of a term at a 
wrong time level, and similar. For each of the bugs, see if the verification tests 
from the previous subexercise pass or fail. This investigation shows how strong 
the energy estimates and the estimate (3.146) are for pointing out errors in the 
implementation. 


Filename: diffu_energy.


Exercise 3.8: Splitting methods and preconditioning
In Sect. 3.6.15, we outlined a class of iterative methods for $Au = b$ based on
splitting $A$ into $A = M - N$ and introducing the iteration

$$Mu^k = Nu^{k-1} + b.$$

The very simplest splitting is $M = I$, where $I$ is the identity matrix. Show that
this choice corresponds to the iteration

$$u^k = u^{k-1} + r^{k-1}, \quad r^{k-1} = b - Au^{k-1}, \tag{3.147}$$

where $r^{k-1}$ is the residual in the linear system in iteration $k-1$. The formula
(3.147) is known as Richardson's iteration. Show that if we apply the simple iteration
method (3.147) to the preconditioned system $M^{-1}Au = M^{-1}b$, we arrive at
the Jacobi method by choosing $M = D$ (the diagonal of $A$) as preconditioner and
the SOR method by choosing $M = \omega^{-1}D + L$ ($L$ being the lower triangular part
of $A$). This equivalence shows that we can apply one iteration of the Jacobi or SOR
method as preconditioner.
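
Before doing the algebra, the claimed equivalence for the Jacobi case can be checked numerically. The sketch below is only such a sanity check under stated assumptions, not a solution of the exercise: `richardson` and `jacobi` are names chosen here, the preconditioner is applied with a dense solve for simplicity, and the small test matrix is arbitrary (diagonally dominant so that the iteration converges).

```python
import numpy as np

def richardson(A, b, u0, M, num_iter):
    """Richardson iteration applied to M^{-1} A u = M^{-1} b."""
    u = np.array(u0, dtype=float)
    for _ in range(num_iter):
        r = b - A @ u                       # residual of the original system
        u = u + np.linalg.solve(M, r)       # u^k = u^{k-1} + M^{-1} r^{k-1}
    return u

def jacobi(A, b, u0, num_iter):
    """Classical Jacobi iteration, written componentwise."""
    u = np.array(u0, dtype=float)
    n = len(b)
    for _ in range(num_iter):
        u_new = np.empty(n)
        for i in range(n):
            s = sum(A[i, j]*u[j] for j in range(n) if j != i)
            u_new[i] = (b[i] - s)/A[i, i]
        u = u_new
    return u

if __name__ == '__main__':
    A = np.array([[4., -1, 0], [-1, 4, -1], [0, -1, 4]])
    b = np.array([1., 2, 3])
    D = np.diag(np.diag(A))
    u_r = richardson(A, b, np.zeros(3), D, 10)
    u_j = jacobi(A, b, np.zeros(3), 10)
    print(abs(u_r - u_j).max())             # the two iterations coincide
```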


Problem 3.9: Oscillating surface temperature of the earth

Consider a day-and-night or seasonal variation in temperature at the surface of the
earth. How deep down in the ground will the surface oscillations reach? For simplicity,
we model only the vertical variation along a coordinate $x$, where $x = 0$
at the surface, and $x$ increases as we go down in the ground. The temperature is
governed by the heat equation

$$\varrho c_v\frac{\partial T}{\partial t} = \nabla\cdot(k\nabla T),$$

in some spatial domain $x\in[0,L]$, where $L$ is chosen large enough such that we can
assume that $T$ is approximately constant, independent of the surface oscillations, for
$x > L$. The parameters $\varrho$, $c_v$, and $k$ are the density, the specific heat capacity at
constant volume, and the heat conduction coefficient, respectively.


a) Derive the mathematical model for computing $T(x,t)$. Assume the surface
oscillations to be sinusoidal around some mean temperature $T_m$. Let $T = T_m$
initially. At $x = L$, assume $T \approx T_m$.

b) Scale the model in a) assuming $k$ is constant. Use a time scale $t_c = \omega^{-1}$ and a
length scale $x_c = \sqrt{2\alpha/\omega}$, where $\alpha = k/(\varrho c_v)$. The primary unknown can be
scaled as $\frac{T - T_m}{2A}$.
Show that the scaled PDE is

$$\frac{\partial u}{\partial\bar t} = \frac{1}{2}\frac{\partial^2 u}{\partial\bar x^2},$$

with initial condition $u(\bar x,0) = 0$, left boundary condition $u(0,\bar t) = \sin(\bar t)$,
and right boundary condition $u(\bar L,\bar t) = 0$. The bar indicates a dimensionless
quantity.

Show that $u(\bar x,\bar t) = e^{-\bar x}\sin(\bar t - \bar x)$ is a solution that fulfills the PDE and the
boundary condition at $\bar x = 0$ (this is the solution we will experience as $\bar t\rightarrow\infty$
and $\bar L\rightarrow\infty$). Conclude that an appropriate domain for $\bar x$ is $[0,4]$ if a damping
$e^{-4}\approx 0.018$ is appropriate for implementing $u\approx\text{const}$; increasing to $[0,6]$
damps $u$ to 0.0025.

c) Compute the scaled temperature and make animations comparing two solutions
with $\bar L = 4$ and $\bar L = 8$, respectively (keep $\Delta x$ the same).


Problem 3.10: Oscillating and pulsating flow in tubes

We consider flow in a straight tube with radius R and straight walls. The flow is 
driven by a pressure gradient f(t). The effect of gravity can be neglected. The 
mathematical problem reads 


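$$\varrho\frac{\partial u}{\partial t} = \mu\frac{1}{r}\frac{\partial}{\partial r}\left(r\frac{\partial u}{\partial r}\right) + \beta(t), \quad r \in [0, R],\ t \in (0,T], \tag{3.148}$$
$$u(r,0) = I(r), \quad r \in [0, R], \tag{3.149}$$
$$u(R,t) = 0, \quad t \in (0,T], \tag{3.150}$$
$$\frac{\partial u}{\partial r}(0,t) = 0, \quad t \in (0,T]. \tag{3.151}$$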


We consider two models for $\beta(t)$. One plain, sinusoidal oscillation:
$$\beta = A\sin(\omega t), \tag{3.152}$$
and one with periodic pulses,
$$\beta = A\sin^{16}(\omega t). \tag{3.153}$$

Note that both models can be written as $\beta = A\sin^m(\omega t)$, with $m = 1$ and $m = 16$,
respectively.


a) Scale the mathematical model, using the viscous time scale $\varrho R^2/\mu$.

b) Implement the scaled model from a), using the unifying $\theta$ scheme in time and
centered differences in space.

c) Verify the implementation in b) using a manufactured solution that is quadratic 
in r and linear in t. Make a corresponding test function. 


Hint You need to include an extra source term in the equation to allow for such
tests. Let the spatial variation be $1 - r^2$ such that the boundary condition is fulfilled.


d) Make animations for $m = 1, 16$ and $\alpha = 1, 0.1$. Choose $T$ such that the motion
has reached a steady state (non-visible changes from period to period in $u$).

e) For $\alpha \gg 1$, the scaling in a) is not good, because the characteristic time for
changes (due to the pressure) is much smaller than the viscous diffusion time
scale ($\alpha$ becomes large). We should in this case base the short time scale on
$1/\omega$. Scale the model again, and make an animation for $m = 1, 16$ and $\alpha = 10$.


Filename: axisymm_flow. 


Problem 3.11: Scaling a welding problem 

Welding equipment makes a very localized heat source that moves in time. We 
shall investigate the heating due to welding and choose, for maximum simplicity, a 
one-dimensional heat equation with a fixed temperature at the ends, and we neglect 
melting. We shall scale the problem, and besides solving such a problem numerically,
the aim is to investigate the appropriateness of alternative scalings.

The governing PDE problem reads

$$\varrho c\frac{\partial u}{\partial t} = k\frac{\partial^2 u}{\partial x^2} + f, \quad x \in (0, L),\ t \in (0, T),$$
$$u(x,0) = U_s, \quad x \in [0, L],$$
$$u(0,t) = u(L,t) = 0, \quad t \in (0, T].$$


Here, $u$ is the temperature, $\varrho$ the density of the material, $c$ a heat capacity, $k$ the
heat conduction coefficient, $f$ is the heat source from the welding equipment, and
$U_s$ is the initial constant (room) temperature in the material.

A possible model for the heat source is a moving Gaussian function:

$$f = A\exp\left(-\frac{1}{2}\left(\frac{x - vt}{\sigma}\right)^2\right),$$

where $A$ is the strength, $\sigma$ is a parameter governing how peak-shaped (or localized
in space) the heat source is, and $v$ is the velocity (in positive $x$ direction) of the
source.


a) Let $x_c$, $t_c$, $u_c$, and $f_c$ be scales, i.e., characteristic sizes, of $x$, $t$, $u$, and $f$,
respectively. The natural choice of $x_c$ and $f_c$ is $L$ and $A$, since these make the
scaled $x$ and $f$ in the interval $[0, 1]$. If each of the three terms in the PDE are
equally important, we can find $t_c$ and $u_c$ by demanding that the coefficients in
the scaled PDE are all equal to unity. Perform this scaling. Use scaled quantities
in the arguments for the exponential function in $f$ too and show that

$$\bar f = \exp\left(-\frac{1}{2}\beta^2(\bar x - \gamma\bar t)^2\right),$$

where $\beta$ and $\gamma$ are dimensionless numbers. Give an interpretation of $\beta$ and $\gamma$.


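b) Argue that for large $\gamma$ we should base the time scale on the movement of the
heat source. Show that this gives rise to the scaled PDE

$$\frac{\partial\bar u}{\partial\bar t} = \gamma^{-1}\frac{\partial^2\bar u}{\partial\bar x^2} + \bar f,$$

and

$$\bar f = \exp\left(-\frac{1}{2}\beta^2(\bar x - \bar t)^2\right).$$

Discuss when the scalings in a) and b) are appropriate.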

c) One aim with scaling is to get a solution that lies in the interval $[-1, 1]$. This is
not always the case when $u_c$ is based on a scale involving a source term, as we
do in a) and b). However, from the scaled PDE we realize that if we replace $\bar f$
with $\delta\bar f$, where $\delta$ is a dimensionless factor, this corresponds to replacing $u_c$ by
$u_c/\delta$. So, if we observe that $\bar u \sim 1/\delta$ in simulations, we can just replace $\bar f$ by
$\delta\bar f$ in the scaled PDE.

Use this trick and implement the two scaled models. Reuse software for the
diffusion equation (e.g., the solver function in diffu1D_vc.py). Make a function
run(gamma, beta=10, delta=40, scaling=1, animate=False)
that runs the model with the given $\gamma$, $\beta$, and $\delta$ parameters as well as an indicator
scaling that is 1 for the scaling in a) and 2 for the scaling in b). The last
argument can be used to turn screen animations on or off.

Experiments show that with $\gamma = 1$ and $\beta = 10$, $\delta = 20$ is appropriate. Then
$\max|\bar u|$ will be larger than 4 for $\gamma = 40$, but that is acceptable.

Equip the run function with visualization, both animation of $\bar u$ and $\bar f$, and plots
with $\bar u$ and $\bar f$ for $\bar t = 0.2$ and $\bar t = 0.5$.


Hint Since the amplitudes of $\bar u$ and $\bar f$ differ by a factor $\delta$, it is attractive to plot
$\bar f/\delta$ together with $\bar u$.


d) Use the software in c) to investigate $\gamma = 0.2, 1, 5, 40$ for the two scalings.
Discuss the results. 


Filename: welding. 


Exercise 3.12: Implement a Forward Euler scheme for axi-symmetric 
diffusion 

Based on the discussion in Sect. 3.5.6, derive in detail the discrete equations for a 
Forward Euler in time, centered in space, finite difference method for axi-symmetric 
diffusion. The diffusion coefficient may be a function of the radial coordinate. At 
the outer boundary r = R, we may have either a Dirichlet or Robin condition. 
Implement this scheme. Construct appropriate test problems. 

Filename: FE_axisym.


Advection-Dominated Equations


Wave (Chap. 2) and diffusion (Chap. 3) equations are solved reliably by finite 
difference methods. As soon as we add a first-order derivative in space, repre- 
senting advective transport (also known as convective transport), the numerics gets 
more complicated and intuitively attractive methods no longer work well. We shall 
show how and why such methods fail and provide remedies. The present chapter 
builds on basic knowledge about finite difference methods for diffusion and wave 
equations, including the analysis by Fourier components, truncation error analysis 
(Appendix B), and compact difference notation. 


Remark on terminology 

It is common to refer to movement of a fluid as convection, while advection is the 
transport of some material dissolved or suspended in the fluid. We shall mostly 
choose the word advection here, but both terms are in heavy use, and for mass 
transport of a substance the PDE has an advection term, while the similar term 
for the heat equation is a convection term. 


A much more comprehensive discussion of dispersion analysis for advection problems
can be found in the book by Duran [3]. This is an excellent resource for
further studies on the topic of advection PDEs, with emphasis on generalizations to
real geophysical problems. The book by Fletcher [4] also has a good overview of 
methods for advection and convection problems. 


4.1 One-Dimensional Time-Dependent Advection Equations 


We consider the pure advection model 


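$$\frac{\partial u}{\partial t} + v\frac{\partial u}{\partial x} = 0, \quad x \in (0,L),\ t \in (0,T], \tag{4.1}$$
$$u(x,0) = I(x), \quad x \in (0,L), \tag{4.2}$$
$$u(0,t) = U_0, \quad t \in (0,T]. \tag{4.3}$$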


In (4.1), $v$ is a given parameter, typically reflecting the transport velocity of a
quantity $u$ with a flow. There is only one boundary condition (4.3) since the spatial
derivative is only first order in the PDE (4.1). If $v > 0$, the information at $x = 0$ and
the initial condition get transported in the positive $x$ direction through the
domain.

It is easiest to find the solution of (4.1) if we remove the boundary condition and
consider a process on the infinite domain $(-\infty, \infty)$. The solution is simply


$$u(x,t) = I(x - vt). \tag{4.4}$$


This is also the solution we expect locally in a finite domain before boundary con- 
ditions have reflected or modified the wave. 
A particular feature of the solution (4.4) is that 


$$u(x_i, t_{n+1}) = u(x_{i-1}, t_n), \tag{4.5}$$
if $x_i = i\Delta x$ and $t_n = n\Delta t$ are points in a uniform mesh. We see this relation from

$$\begin{aligned}
u(i\Delta x, (n+1)\Delta t) &= I(i\Delta x - v(n+1)\Delta t) \\
&= I((i-1)\Delta x - vn\Delta t - v\Delta t + \Delta x) \\
&= I((i-1)\Delta x - vn\Delta t) \\
&= u((i-1)\Delta x, n\Delta t),
\end{aligned}$$

provided $v = \Delta x/\Delta t$. So, whenever we see a scheme that collapses to

$$u_i^{n+1} = u_{i-1}^n, \tag{4.6}$$
for the PDE in question, we have in fact a scheme that reproduces the analytical
solution, and many of the schemes to be presented possess this nice property!
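The property (4.5) can be checked numerically on the exact solution (4.4). The following is only a minimal sketch: the Gaussian profile, the velocity, and the mesh size are arbitrary choices made for illustration and do not come from the text.

```python
import numpy as np

def I(x):
    """Smooth initial profile (a Gaussian chosen only for illustration)."""
    return np.exp(-0.5*((x - 0.3)/0.05)**2)

v = 2.0                     # assumed transport velocity
dx = 0.01
dt = dx/v                   # choose dt such that v = dx/dt
x = np.arange(0, 101)*dx    # mesh points x_i = i*dx

def u_exact(x, t):
    """Exact solution (4.4) on the infinite domain."""
    return I(x - v*t)

n = 7                               # arbitrary time level
lhs = u_exact(x[1:], (n + 1)*dt)    # u(x_i, t_{n+1}) for i = 1, 2, ...
rhs = u_exact(x[:-1], n*dt)         # u(x_{i-1}, t_n)
max_diff = np.abs(lhs - rhs).max()
print(max_diff)                     # only round-off errors remain
```

Choosing dt different from dx/v makes the two sides differ, which is why the exactness of a scheme of the form (4.6) is tied to the value of the mesh ratio.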

Finally, we add that a discussion of appropriate boundary conditions for the ad- 
vection PDE in multiple dimensions is a challenging topic beyond the scope of this 
text. 


4.1.1 Simplest Scheme: Forward in Time, Centered in Space 


Method A first attempt to solve a PDE like (4.1) will normally be to look for a 
time-discretization scheme that is explicit so we avoid solving systems of linear 
equations. In space, we anticipate that centered differences are most accurate and 
therefore best. These two arguments lead us to a Forward Euler scheme in time and 
centered differences in space: 


$$[D_t^+ u + vD_{2x}u = 0]_i^n. \tag{4.7}$$


Written out, we see that this expression implies that 


$$u_i^{n+1} = u_i^n - \frac{1}{2}C(u_{i+1}^n - u_{i-1}^n),$$

with $C$ as the Courant number
$$C = \frac{v\Delta t}{\Delta x}.$$

Implementation A solver function for our scheme goes as follows.


import numpy as np
import matplotlib.pyplot as plt


def solver_FECS(I, U0, v, L, dt, C, T, user_action=None):
    Nt = int(round(T/float(dt)))
    t = np.linspace(0, Nt*dt, Nt+1)   # Mesh points in time
    dx = v*dt/C
    Nx = int(round(L/dx))
    x = np.linspace(0, L, Nx+1)       # Mesh points in space
    # Make sure dx and dt are compatible with x and t
    dx = x[1] - x[0]
    dt = t[1] - t[0]
    C = v*dt/dx

    u = np.zeros(Nx+1)     # Solution at the new time level
    u_n = np.zeros(Nx+1)   # Solution at the previous time level

    # Set initial condition u(x,0) = I(x)
    for i in range(0, Nx+1):
        u_n[i] = I(x[i])

    if user_action is not None:
        user_action(u_n, x, t, 0)

    for n in range(0, Nt):
        # Compute u at inner mesh points
        for i in range(1, Nx):
            u[i] = u_n[i] - 0.5*C*(u_n[i+1] - u_n[i-1])

        # Insert boundary condition
        u[0] = U0

        if user_action is not None:
            user_action(u, x, t, n+1)

        # Switch variables before next step
        u_n, u = u, u_n


Test cases The typical solution $u$ has the shape of $I$ and is transported at velocity
$v$ to the right (if $v > 0$). Let us consider two different initial conditions, one smooth
(Gaussian pulse) and one non-smooth (half-truncated cosine pulse):

$$u(x,0) = Ae^{-\frac{1}{2}\left(\frac{x - L/10}{\sigma}\right)^2}, \tag{4.8}$$
$$u(x,0) = A\cos\left(\frac{5\pi}{L}\left(x - \frac{L}{10}\right)\right), \quad x < \frac{L}{5} \text{ else } 0. \tag{4.9}$$


The parameter A is the maximum value of the initial condition. 
Before doing numerical simulations, we scale the PDE problem and introduce
$\bar x = x/L$ and $\bar t = vt/L$, which gives

$$\frac{\partial\bar u}{\partial\bar t} + \frac{\partial\bar u}{\partial\bar x} = 0.$$

The unknown $u$ is scaled by the maximum value of the initial condition:
$\bar u = u/\max|I(x)|$ such that $|\bar u(\bar x,0)| \in [0,1]$. The scaled problem is solved by setting
$v = 1$, $L = 1$, and $A = 1$. From now on we drop the bars.

To run our test cases and plot the solution, we make the function 


def run_FECS(case):
    """Special function for the FECS case."""
    if case == 'gaussian':
        def I(x):
            return np.exp(-0.5*((x-L/10)/sigma)**2)
    elif case == 'cosinehat':
        def I(x):
            return np.cos(np.pi*5/L*(x - L/10)) if x < L/5 else 0

    L = 1.0
    sigma = 0.02
    legends = []
    lines = None  # Line objects in figure 1, kept between calls to plot

    def plot(u, x, t, n):
        """Animate and plot every m steps in the same figure."""
        nonlocal lines
        plt.figure(1)
        if n == 0:
            lines = plt.plot(x, u)
        else:
            lines[0].set_ydata(u)
            plt.draw()
            #plt.savefig()
        plt.figure(2)
        m = 40
        if n % m != 0:
            return
        print('t=%g, n=%d, u in [%g, %g] w/%d points' %
              (t[n], n, u.min(), u.max(), x.size))
        if np.abs(u).max() > 3:  # Instability?
            return
        plt.plot(x, u)  # Curves accumulate in figure 2
        legends.append('t=%g' % t[n])

    plt.ion()
    U0 = 0
    dt = 0.001
    C = 1
    T = 1
    # Call the FECS solver defined above
    solver_FECS(I=I, U0=U0, v=1.0, L=L, dt=dt, C=C, T=T,
                user_action=plot)
    plt.legend(legends, loc='lower left')
    plt.savefig('tmp.png'); plt.savefig('tmp.pdf')
    plt.axis([0, L, -0.75, 1.1])
    plt.show()


Bug? Running either of the test cases, the plot becomes a mess, and the printout of
$u$ values in the plot function reveals that $u$ grows very quickly. We may reduce $\Delta t$
and make it very small, yet the solution just grows. Such behavior points to a bug in
the code. However, choosing a coarse mesh and performing one time step by hand
calculations produces the same numbers as the code, so the implementation seems
to be correct. The hypothesis is therefore that the solution is unstable.


4.1.2 Analysis of the Scheme 
It is easy to show that a typical Fourier component
$$u(x,t) = B\sin(k(x - ct))$$
with $c = v$ is a solution of our PDE for any spatial wave length $\lambda = 2\pi/k$ and any amplitude
$B$. (Since the PDE to be investigated by this method is homogeneous and linear,
$B$ will always cancel out, so we tend to skip this amplitude, but keep it here in the
beginning for completeness.)

A general solution may be viewed as a collection of long and short waves with
different amplitudes. Algebraically, the work simplifies if we introduce the complex
Fourier component, evaluated at time $t_n = n\Delta t$,

$$u(x,t_n) = BA_e^n e^{ikx},$$

with
$$A_e = e^{-ikv\Delta t} = e^{-iCk\Delta x}.$$
Note that $|A_e| = 1$, so in particular $|A_e| \leq 1$.
It turns out that many schemes also allow a Fourier wave component as solution,
and we can use the numerically computed values of $A_e$ (denoted $A$) to learn about
the quality of the scheme. Hence, to analyze the difference scheme we have just
implemented, we look at how it treats the Fourier component

$$u_q^n = A^n e^{ikq\Delta x}.$$


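Inserting the numerical component in the scheme,
$$[D_t^+ Ae^{ikq\Delta x} + vD_{2x}Ae^{ikq\Delta x} = 0]_q^n,$$

and making use of (A.25) results in

$$A^n e^{ikq\Delta x}\left(\frac{A - 1}{\Delta t} + v\frac{1}{\Delta x}i\sin(k\Delta x)\right) = 0,$$

which implies
$$A = 1 - iC\sin(k\Delta x).$$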


The numerical solution features the formula $A^n$. To find out whether $A^n$ means
growth in time, we rewrite $A$ in polar form: $A = A_r e^{i\phi}$, for real numbers $A_r$ and
$\phi$, since we then have $A^n = A_r^n e^{i\phi n}$. The magnitude of $A^n$ is $A_r^n$. In our case,
$A_r = (1 + C^2\sin^2(k\Delta x))^{1/2} > 1$, so $A_r^n$ will increase in time, whereas the exact
solution will not. Regardless of $\Delta t$, we get unstable numerical solutions.
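The growth can be made concrete by evaluating the magnitude of $A$ for a range of $p = k\Delta x$ values. This is a small sketch; the function name `A_FECS` and the chosen Courant numbers are assumptions made for illustration.

```python
import numpy as np

def A_FECS(C, p):
    """Amplification factor of Forward Euler, centered in space; p = k*dx."""
    return 1 - 1j*C*np.sin(p)

# Stay away from p = 0 and p = pi, where sin(p) = 0 and |A| = 1
p = np.linspace(0.01, np.pi - 0.01, 200)
for C in 1.0, 0.1, 0.001:
    Ar = np.abs(A_FECS(C, p))
    print('C=%g: min |A| = %.12f, max |A| = %.12f' % (C, Ar.min(), Ar.max()))
```

Even for a very small Courant number the magnitude stays above 1, in line with the observation that reducing $\Delta t$ does not cure the instability.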
4.1.3 Leapfrog in Time, Centered Differences in Space


Method Another explicit scheme is to do a “leapfrog” jump over $2\Delta t$ in time and
combine it with central differences in space:

$$[D_{2t}u + vD_{2x}u = 0]_i^n,$$

which results in the updating formula

$$u_i^{n+1} = u_i^{n-1} - C(u_{i+1}^n - u_{i-1}^n).$$

A special scheme is needed to compute $u^1$, but we leave that problem for now.
Anyway, this special scheme can be found in advec1D.py.


Implementation We now need to work with three time levels and must modify our 
solver a bit: 


Nt = int(round(T/float(dt)))
t = np.linspace(0, Nt*dt, Nt+1)   # Mesh points in time
...
u   = np.zeros(Nx+1)   # Solution at time level n+1
u_1 = np.zeros(Nx+1)   # Solution at time level n
u_2 = np.zeros(Nx+1)   # Solution at time level n-1
...
for n in range(0, Nt):
    if scheme == 'FE':
        for i in range(1, Nx):
            u[i] = u_1[i] - 0.5*C*(u_1[i+1] - u_1[i-1])
    elif scheme == 'LF':
        if n == 0:
            # Use some scheme for the first step
            for i in range(1, Nx):
                ...
        else:
            # Leapfrog: centered difference in space around point i
            for i in range(1, Nx):
                u[i] = u_2[i] - C*(u_1[i+1] - u_1[i-1])

    # Switch variables before next step
    u_2, u_1, u = u_1, u, u_2


Running a test case Let us try a coarse mesh such that the smooth Gaussian initial
condition is represented by 1 at mesh node 1 and 0 at all other nodes. This triangular
initial condition should then be advected to the right. Choosing scaled variables as
$\Delta t = 0.1$, $T = 1$, and $C = 1$ gives the plot in Fig. 4.1, which is in fact identical to
the exact solution (!).


Running more test cases We can run two types of initial conditions for $C = 0.8$:
one very smooth with a Gaussian function (Fig. 4.2) and one with a discontinuity in
the first derivative (Fig. 4.3). Unless we have a very fine mesh, as in the left plots in
the figures, we get small ripples behind the main wave, and this main wave has the
amplitude reduced.

Fig. 4.2 Advection of a Gaussian function with a leapfrog scheme and $C = 0.8$, $\Delta t = 0.001$
(left) and $\Delta t = 0.01$ (right)


Advection of the Gaussian function with a leapfrog scheme, using $C = 0.8$ and
$\Delta t = 0.01$ can be seen in a movie file¹. Alternatively, with $\Delta t = 0.001$, we get
this movie file².

Advection of the cosine hat function with a leapfrog scheme, using $C = 0.8$ and
$\Delta t = 0.01$ can be seen in a movie file³. Alternatively, with $\Delta t = 0.001$, we get
this movie file⁴.


¹ http://tinyurl.com/gokgkov/mov-advec/gaussian/LF/C08_dt01.ogg
² http://tinyurl.com/gokgkov/mov-advec/gaussian/LF/C08_dt001.ogg
³ http://tinyurl.com/gokgkov/mov-advec/cosinehat/LF/C08_dt01.ogg
⁴ http://tinyurl.com/gokgkov/mov-advec/cosinehat/LF/C08_dt001.ogg

Fig. 4.3 Advection of half a cosine function with a leapfrog scheme and $C = 0.8$, $\Delta t = 0.001$
(left) and $\Delta t = 0.01$ (right)


Analysis We can perform a Fourier analysis again. Inserting the numerical Fourier
component in the Leapfrog scheme, we get

$$A^2 + i2C\sin(k\Delta x)A - 1 = 0,$$

and

$$A = -iC\sin(k\Delta x) \pm \sqrt{1 - C^2\sin^2(k\Delta x)}.$$

Rewriting to polar form, $A = A_r e^{i\phi}$, we see that $A_r = 1$, so the numerical
component is neither increasing nor decreasing in time, which is exactly what we want.
However, for $C > 1$, the square root can become complex valued, so stability is
obtained only as long as $C \leq 1$.
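Both claims about the Leapfrog amplification factor can be checked numerically. The sketch below evaluates the two roots of the quadratic equation for $A$; the helper name `A_leapfrog` and the sample Courant numbers are assumptions for illustration.

```python
import numpy as np

def A_leapfrog(C, p):
    """Both roots of A**2 + 2j*C*sin(p)*A - 1 = 0, with p = k*dx."""
    s = C*np.sin(p)
    # A complex square root handles the case 1 - s**2 < 0 (possible for C > 1)
    root = np.sqrt((1 - s**2).astype(complex))
    return -1j*s + root, -1j*s - root

p = np.linspace(0, np.pi, 101)
for C in 0.8, 1.0, 1.2:
    A1, A2 = A_leapfrog(C, p)
    max_abs = max(np.abs(A1).max(), np.abs(A2).max())
    print('C=%g: max |A| = %g' % (C, max_abs))
```

For $C \leq 1$ both roots have unit magnitude, while for $C > 1$ one of the roots exceeds 1 in magnitude for some $p$, which signals growth.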


Stability
For all the working schemes to be presented in this chapter, we get the stability
condition $C \leq 1$:
$$\Delta t \leq \frac{\Delta x}{v}.$$
This is called the CFL condition and applies almost always to successful schemes
for advection problems. Of course, one can use Crank-Nicolson or Backward
Euler schemes for increased and even unconditional stability (no $\Delta t$ restrictions),
but these have other less desired damping problems.


We introduce $p = k\Delta x$. The amplification factor now reads

$$A = -iC\sin p \pm \sqrt{1 - C^2\sin^2 p},$$

and is to be compared to the exact amplification factor

$$A_e = e^{-ikv\Delta t} = e^{-ikC\Delta x} = e^{-iCp}.$$

Section 4.1.9 compares numerical amplification factors of many schemes with the
exact expression.

Fig. 4.4 Advection of a Gaussian function with a forward in time, upwind in space scheme and
$C = 0.8$, $\Delta t = 0.01$ (left) and $\Delta t = 0.001$ (right)


4.1.4 Upwind Differences in Space 


Since the PDE reflects transport of information along with a flow in positive x 
direction, when v > 0, it could be natural to go (what is called) upstream and not 
downstream in the spatial derivative to collect information about the change of the 
function. That is, we approximate 


$$\frac{\partial u}{\partial x}(x_i, t_n) \approx [D_x^- u]_i^n = \frac{u_i^n - u_{i-1}^n}{\Delta x}.$$

This is called an upwind difference (the corresponding difference in the time direction
would be called a backward difference, and we could use that name in space
too, but upwind is the common name for a difference against the flow in advection
problems). This spatial approximation does magic compared to the scheme we
had with Forward Euler in time and centered difference in space. The scheme with an
upwind difference,

$$[D_t^+ u + vD_x^- u = 0]_i^n, \tag{4.10}$$

written out as

$$u_i^{n+1} = u_i^n - C(u_i^n - u_{i-1}^n),$$

is a generally popular and robust scheme that is stable if $C \leq 1$. As with the
Leapfrog scheme, it becomes exact if $C = 1$, exactly as shown in Fig. 4.1. This
is easy to see since $C = 1$ gives the property (4.6). However, any $C < 1$ gives a
significant reduction in the amplitude of the solution, which is a purely numerical
effect, see Fig. 4.4 and 4.5. Experiments show, however, that reducing $\Delta t$ or $\Delta x$,
while keeping $C$ fixed, reduces the error.

Advection of the Gaussian function with a forward in time, upwind in space
scheme, using $C = 0.8$ and $\Delta t = 0.01$ can be seen in a movie file⁵. Alternatively,
with $\Delta t = 0.005$, we get this movie file⁶.


⁵ http://tinyurl.com/gokgkov/mov-advec/gaussian/UP/C08_dt001/movie.ogg
⁶ http://tinyurl.com/gokgkov/mov-advec/gaussian/UP/C08_dt0005/movie.ogg

Fig. 4.5 Advection of half a cosine function with a forward in time, upwind in space scheme and
$C = 0.8$, $\Delta t = 0.001$ (left) and $\Delta t = 0.01$ (right)


Advection of the cosine hat function with a forward in time, upwind in space
scheme, using $C = 0.8$ and $\Delta t = 0.01$ can be seen in a movie file⁷. Alternatively,
with $\Delta t = 0.001$, we get this movie file⁸.

The amplification factor can be computed using the formula (A.23),

$$\frac{A - 1}{\Delta t} + \frac{v}{\Delta x}\left(1 - e^{-ik\Delta x}\right) = 0,$$

which means
$$A = 1 - C(1 - \cos(p) + i\sin(p)).$$

For $C < 1$ there is, unfortunately, non-physical damping of discrete Fourier
components, giving rise to reduced amplitude of $u_i^n$ as in Fig. 4.4 and 4.5. The damping
seen in these figures is quite severe. Stability requires $C \leq 1$.
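The damping can be quantified by evaluating $|A|$ and comparing $A$ with the exact factor $e^{-iCp}$. The following is a sketch only; the function name `A_upwind` and the sampled Courant numbers are illustrative assumptions.

```python
import numpy as np

def A_upwind(C, p):
    """Amplification factor A = 1 - C*(1 - exp(-i*p)) of the upwind scheme."""
    return 1 - C*(1 - np.exp(-1j*p))

p = np.linspace(0, np.pi, 101)
for C in 1.0, 0.8, 0.5:
    A = A_upwind(C, p)
    exact = np.exp(-1j*C*p)     # exact amplification factor exp(-i*C*p)
    print('C=%g: min |A| = %.3f, max |A - A_e| = %.3f' %
          (C, np.abs(A).min(), np.abs(A - exact).max()))
```

With $C = 1$ the numerical and exact factors coincide, while for $C < 1$ the shortest waves ($p$ close to $\pi$) are damped the most.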


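Interpretation of upwind difference as artificial diffusion
One can interpret the upwind difference as extra, artificial diffusion in the
equation. Solving

$$\frac{\partial u}{\partial t} + v\frac{\partial u}{\partial x} = \nu\frac{\partial^2 u}{\partial x^2},$$

by a forward difference in time and centered differences in space,

$$[D_t^+ u + vD_{2x}u = \nu D_xD_x u]_i^n,$$

actually gives the upwind scheme (4.10) if $\nu = v\Delta x/2$. That is, solving the
PDE $u_t + vu_x = 0$ by centered differences in space and forward difference in
time is unsuccessful, but by adding some artificial diffusion $\nu u_{xx}$, the method
becomes stable:

$$\frac{\partial u}{\partial t} + v\frac{\partial u}{\partial x} = \left(\alpha + \frac{v\Delta x}{2}\right)\frac{\partial^2 u}{\partial x^2}$$

(with $\alpha$ denoting any physical diffusion coefficient, which is zero in the pure
advection model).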
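The equivalence between the upwind scheme and the centered scheme with artificial diffusion can be verified directly on the update formulas. In this sketch, the mesh Fourier number $F = \nu\Delta t/\Delta x^2$ is introduced for convenience (it is not used in the text), and $\nu = v\Delta x/2$ then corresponds to $F = C/2$. The test profile is arbitrary.

```python
import numpy as np

def step_upwind(u, C):
    """One upwind step at the inner points i = 1, ..., N-1."""
    return u[1:-1] - C*(u[1:-1] - u[:-2])

def step_FECS_diffusion(u, C, F):
    """Forward Euler, centered advection plus diffusion; F = nu*dt/dx**2."""
    return (u[1:-1] - 0.5*C*(u[2:] - u[:-2])
            + F*(u[2:] - 2*u[1:-1] + u[:-2]))

x = np.linspace(0, 1, 51)
u = np.exp(-0.5*((x - 0.3)/0.05)**2)   # arbitrary test profile
C = 0.8
F = C/2     # follows from nu = v*dx/2, since F = nu*dt/dx**2 = C/2
diff = np.abs(step_upwind(u, C) - step_FECS_diffusion(u, C, F)).max()
print(diff)  # only round-off errors remain
```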


⁷ http://tinyurl.com/gokgkov/mov-advec/cosinehat/UP/C08_dt01.ogg
⁸ http://tinyurl.com/gokgkov/mov-advec/cosinehat/UP/C08_dt001.ogg

4.1.5 Periodic Boundary Conditions


So far, we have given the value on the left boundary, $u_0^n$, and used the scheme to
propagate the solution signal through the domain. Often, we want to follow such
signals for long time series, and periodic boundary conditions are then relevant
since they enable a signal that leaves the right boundary to immediately enter the
left boundary and propagate through the domain again.

The periodic boundary condition is

$$u(0,t) = u(L,t), \quad u_0^n = u_{N_x}^n.$$

It means that we in the first equation, involving $u_0^n$, insert $u_{N_x}^n$, and that we in the
last equation, involving $u_{N_x}^{n+1}$, insert $u_0^{n+1}$. Normally, we can do this in the simple
way that u_1[0] is updated as u_1[Nx] at the beginning of a new time level.

In some schemes we may need $u_{N_x+1}^n$ and $u_{-1}^n$. Periodicity then means that
these values are equal to $u_1^n$ and $u_{N_x-1}^n$, respectively. For the upwind scheme, it
is sufficient to set u_1[0]=u_1[Nx] at a new time level before computing u[1].
This ensures that u[1] becomes right and at the next time level u[0] at the current
time level is correctly updated. For the Leapfrog scheme we must update u[0] and
u[Nx] using the scheme:


if periodic_bc:
    i = 0
    u[i] = u_2[i] - C*(u_1[i+1] - u_1[Nx-1])
for i in range(1, Nx):
    u[i] = u_2[i] - C*(u_1[i+1] - u_1[i-1])
if periodic_bc:
    u[Nx] = u[0]
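An alternative way of handling periodicity, not used in the text, is to store only the distinct points $x_0,\ldots,x_{N_x-1}$ and let `np.roll` wrap the indices. The sketch below does this for the upwind scheme; the function name and the storage convention are assumptions of this example.

```python
import numpy as np

def upwind_periodic(u0, C, num_steps):
    """Advance the upwind scheme on a periodic mesh of distinct points.

    u0 holds values at x_0, ..., x_{N-1}; the point x_N = L coincides
    with x_0 and is therefore not stored (an assumption of this sketch).
    """
    u = u0.copy()
    for _ in range(num_steps):
        # np.roll(u, 1)[i] equals u[i-1], with u[-1] wrapping around to i=0
        u = u - C*(u - np.roll(u, 1))
    return u

N = 50
x = np.linspace(0, 1, N, endpoint=False)
u0 = np.exp(-0.5*((x - 0.5)/0.05)**2)
u = upwind_periodic(u0, C=1.0, num_steps=N)   # one full revolution with C = 1
print(np.abs(u - u0).max())
```

With $C = 1$ each step is a pure shift by one cell, so after $N$ steps the pulse is back at its starting position up to round-off.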


4.1.6 Implementation 


Test condition Analytically, we can show that the integral in space under the
$u(x,t)$ curve is constant:

$$\frac{\partial}{\partial t}\int_0^L u\,dx = -\int_0^L v\frac{\partial u}{\partial x}\,dx = -\left[vu\right]_0^L = 0,$$

as long as $u(0) = u(L) = 0$. We can therefore use the property

$$\int_0^L u(x,t)\,dx = \text{const}$$

as a partial verification during the simulation. Now, any numerical method with
$C \neq 1$ will deviate from the constant, expected value, so the integral is a measure of
the error in the scheme. The integral can be computed by the Trapezoidal integration
rule

dx*(0.5*u[0] + 0.5*u[Nx] + np.sum(u[1:-1]))

if u is an array holding the solution.
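As a small illustration, the Trapezoidal formula can be wrapped in a function and applied to the Gaussian initial condition, whose integral over the whole real line is $\sigma\sqrt{2\pi}$. The function name `trapezoidal` and the mesh resolution are assumptions made for this sketch.

```python
import numpy as np

def trapezoidal(u, dx):
    """Trapezoidal rule for mesh values u[0], ..., u[Nx] with spacing dx."""
    return dx*(0.5*u[0] + 0.5*u[-1] + np.sum(u[1:-1]))

L = 1.0
sigma = 0.02
x = np.linspace(0, L, 101)
dx = x[1] - x[0]
u = np.exp(-0.5*((x - L/10)/sigma)**2)   # the Gaussian initial condition
numerical = trapezoidal(u, dx)
expected = sigma*np.sqrt(2*np.pi)        # Gaussian integral over the real line
print(numerical, expected)
```

The two numbers agree closely because the pulse is essentially zero at both ends of the domain, which is also the condition under which the integral stays constant in time.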


The code An appropriate solver function for multiple schemes may go as shown 
below. 


def solver(I, U0, v, L, dt, C, T, user_action=None,
           scheme='FE', periodic_bc=True):
    Nt = int(round(T/float(dt)))
    t = np.linspace(0, Nt*dt, Nt+1)   # Mesh points in time
    dx = v*dt/C
    Nx = int(round(L/dx))
    x = np.linspace(0, L, Nx+1)       # Mesh points in space
    # Make sure dx and dt are compatible with x and t
    dx = x[1] - x[0]
    dt = t[1] - t[0]
    C = v*dt/dx
    print('dt=%g, dx=%g, Nx=%d, C=%g' % (dt, dx, Nx, C))

    u = np.zeros(Nx+1)        # Solution at time level n+1
    u_n = np.zeros(Nx+1)      # Solution at time level n
    u_nm1 = np.zeros(Nx+1)    # Solution at time level n-1
    integral = np.zeros(Nt+1)

    # Set initial condition u(x,0) = I(x)
    for i in range(0, Nx+1):
        u_n[i] = I(x[i])

    # Insert boundary condition
    u[0] = U0

    # Compute the integral under the curve
    integral[0] = dx*(0.5*u_n[0] + 0.5*u_n[Nx] + np.sum(u_n[1:-1]))

    if user_action is not None:
        user_action(u_n, x, t, 0)

    for n in range(0, Nt):
        if scheme == 'FE':
            if periodic_bc:
                i = 0
                # Left neighbor of i=0 is i=Nx-1 since x_0 and x_Nx coincide
                u[i] = u_n[i] - 0.5*C*(u_n[i+1] - u_n[Nx-1])
                u[Nx] = u[0]
            for i in range(1, Nx):
                u[i] = u_n[i] - 0.5*C*(u_n[i+1] - u_n[i-1])
        elif scheme == 'LF':
            if n == 0:
                # Use upwind for first step
                if periodic_bc:
                    i = 0
                    u_n[i] = u_n[Nx]
                for i in range(1, Nx+1):
                    u[i] = u_n[i] - C*(u_n[i] - u_n[i-1])
            else:
                if periodic_bc:
                    i = 0
                    u[i] = u_nm1[i] - C*(u_n[i+1] - u_n[Nx-1])
                for i in range(1, Nx):
                    u[i] = u_nm1[i] - C*(u_n[i+1] - u_n[i-1])
                if periodic_bc:
                    u[Nx] = u[0]
        elif scheme == 'UP':
            if periodic_bc:
                u_n[0] = u_n[Nx]
            for i in range(1, Nx+1):
                u[i] = u_n[i] - C*(u_n[i] - u_n[i-1])
        else:
            raise ValueError('scheme="%s" not implemented' % scheme)

        if not periodic_bc:
            # Insert boundary condition
            u[0] = U0

        # Compute the integral under the curve
        integral[n+1] = dx*(0.5*u[0] + 0.5*u[Nx] + np.sum(u[1:-1]))

        if user_action is not None:
            user_action(u, x, t, n+1)

        # Switch variables before next step
        u_nm1, u_n, u = u_n, u, u_nm1
    return integral

Solving a specific problem We need to call up the solver function in some kind
of administering problem solving function that can solve specific problems and
make appropriate visualization. The function below makes both static plots, screen
animation, and hard copy videos in various formats.
def run(scheme='UP', case='gaussian', C=1, dt=0.01):
    """General admin routine for explicit and implicit solvers."""

    if case == 'gaussian':
        def I(x):
            return np.exp(-0.5*((x-L/10)/sigma)**2)
    elif case == 'cosinehat':
        def I(x):
            return np.cos(np.pi*5/L*(x - L/10)) if x < L/5 else 0

    L = 1.0
    sigma = 0.02
    global lines  # needs to be saved between calls to plot

    def plot(u, x, t, n):
        """Plot t=0 and t=0.6 in the same figure."""
        plt.figure(1)
        global lines
        if n == 0:
            lines = plt.plot(x, u)
            plt.axis([x[0], x[-1], -0.5, 1.5])
            plt.xlabel('x'); plt.ylabel('u')
            plt.gca().set_aspect(0.15)  # Act on the current axes
            plt.savefig('tmp_%04d.png' % n)
            plt.savefig('tmp_%04d.pdf' % n)
        else:
            lines[0].set_ydata(u)
            plt.axis([x[0], x[-1], -0.5, 1.5])
            plt.title('C=%g, dt=%g, dx=%g' %
                      (C, t[1]-t[0], x[1]-x[0]))
            plt.legend(['t=%.3f' % t[n]])
            plt.xlabel('x'); plt.ylabel('u')
            plt.draw()
            plt.savefig('tmp_%04d.png' % n)
        plt.figure(2)
        eps = 1E-14
        if abs(t[n] - 0.6) > eps and abs(t[n] - 0) > eps:
            return
        print('t=%g, n=%d, u in [%g, %g] w/%d points' %
              (t[n], n, u.min(), u.max(), x.size))
        if np.abs(u).max() > 3:  # Instability?
            return
        plt.plot(x, u)  # Curves accumulate in figure 2
        plt.draw()
        if n > 0:
            # Exact solution for comparison
            y = [I(x_-v*t[n]) for x_ in x]
            plt.plot(x, y, 'k--')
            if abs(t[n] - 0.6) < eps:
                filename = ('tmp_%s_dt%s_C%s' %
                            (scheme, t[1]-t[0], C)).replace('.', '')
                np.savez(filename, x=x, u=u, u_e=y)

    plt.ion()
    U0 = 0
    T = 0.7
    v = 1

    # Define video formats and libraries
    codecs = dict(flv='flv', mp4='libx264', webm='libvpx',
                  ogg='libtheora')
    # Remove video files
    import glob, os
    for name in glob.glob('tmp_*.png'):
        os.remove(name)
    for ext in codecs:
        name = 'movie.%s' % ext
        if os.path.isfile(name):
            os.remove(name)

    integral = solver(
        I=I, U0=U0, v=v, L=L, dt=dt, C=C, T=T,
        scheme=scheme, user_action=plot)
    # Finish up figure(2)
    plt.figure(2)
    plt.axis([0, L, -0.5, 1.1])
    plt.xlabel('$x$'); plt.ylabel('$u$')
    plt.savefig('tmp1.png'); plt.savefig('tmp1.pdf')
    plt.show()
    # Make videos from figure(1) animation files
    for codec in codecs:
        cmd = 'ffmpeg -i tmp_%%04d.png -r 25 -vcodec %s movie.%s' % \
              (codecs[codec], codec)
        os.system(cmd)
    print('Integral of u:', integral.max(), integral.min())


The complete code is found in the file advec1D.py.


4.1.7 A Crank-Nicolson Discretization in Time and Centered 
Differences in Space 


Another obvious candidate for time discretization is the Crank-Nicolson method 
combined with centered differences in space: 


$$[D_t u]_i^{n+\frac{1}{2}} + v\frac{1}{2}\left([D_{2x}u]_i^{n+1} + [D_{2x}u]_i^n\right) = 0.$$
It can be nice to include the Backward Euler scheme too, via the $\theta$-rule,

$$[D_t u]_i^{n+\frac{1}{2}} + v\theta[D_{2x}u]_i^{n+1} + v(1 - \theta)[D_{2x}u]_i^n = 0.$$

When $\theta$ is different from zero, this gives rise to an implicit scheme,

$$u_i^{n+1} + \frac{\theta}{2}C(u_{i+1}^{n+1} - u_{i-1}^{n+1}) = u_i^n - \frac{1 - \theta}{2}C(u_{i+1}^n - u_{i-1}^n)$$

for $i = 1,\ldots,N_x - 1$. At the boundaries we set $u = 0$ and simulate just to the
point of time when the signal hits the boundary (and gets reflected).

$$u_0^{n+1} = u_{N_x}^{n+1} = 0.$$

Fig. 4.6 Crank-Nicolson in time, centered in space, Gaussian profile, $C = 0.8$, $\Delta t = 0.01$ (left)
and $\Delta t = 0.005$ (right)

The elements on the diagonal in the matrix become:
$$A_{i,i} = 1, \quad i = 0,\ldots,N_x.$$
On the subdiagonal and superdiagonal we have

$$A_{i-1,i} = -\frac{\theta}{2}C, \quad A_{i+1,i} = \frac{\theta}{2}C, \quad i = 1,\ldots,N_x - 1,$$
with $A_{0,1} = 0$ and $A_{N_x-1,N_x} = 0$ due to the known boundary conditions. And
finally, the right-hand side becomes

$$b_0 = u_{N_x}^n,$$
$$b_i = u_i^n - \frac{1 - \theta}{2}C(u_{i+1}^n - u_{i-1}^n), \quad i = 1,\ldots,N_x - 1,$$
$$b_{N_x} = u_0^n.$$

The dispersion relation follows from inserting $u_q^n = A^n e^{ikq\Delta x}$ and using the
formula (A.25) for the spatial differences:

$$A = \frac{1 - (1 - \theta)iC\sin p}{1 + \theta iC\sin p}.$$

Movie 1 Crank-Nicolson in time, centered in space, $C = 0.8$, $\Delta t = 0.005$.
https://raw.githubusercontent.com/hplgit/fdm-book/master/doc/pub/book/html/mov-advec/gaussian/CN/C08_dt0005/movie.ogg


Movie 2 Backward-Euler in time, centered in space, $C = 0.8$, $\Delta t = 0.005$.
https://raw.githubusercontent.com/hplgit/fdm-book/master/doc/pub/book/html/mov-advec/cosinehat/BE/C_08_dt005.ogg


Fig. 4.7 Backward-Euler in time, centered in space, half a cosine profile, $C = 0.8$, $\Delta t = 0.01$
(left) and $\Delta t = 0.005$ (right)


Figure 4.6 depicts a numerical solution for $C = 0.8$ and the Crank-Nicolson scheme,
with severe oscillations behind the main wave. These oscillations are damped as the
mesh is refined. Switching to the Backward Euler scheme removes the oscillations,
but the amplitude is significantly reduced. One could expect that the discontinuous
derivative in the initial condition of the half a cosine wave would make even
stronger demands on producing a smooth profile, but Fig. 4.7 shows that also here,
Backward-Euler is capable of producing a smooth profile. All in all, there are no
major differences between the Gaussian initial condition and the half a cosine
condition for any of the schemes.


4.1.8 The Lax-Wendroff Method 


The Lax-Wendroff method is based on three ideas: 

1. Express the new unknown $u_i^{n+1}$ in terms of known quantities at $t = t_n$ by means
of a Taylor polynomial of second degree.

2. Replace time-derivatives at $t = t_n$ by spatial derivatives, using the PDE.

3. Discretize the spatial derivatives by second-order differences so we achieve a
scheme of accuracy $O(\Delta t^2) + O(\Delta x^2)$.


Let us follow the recipe. First we have the three-term Taylor polynomial,

$$u_i^{n+1} = u_i^n + \Delta t\left(\frac{\partial u}{\partial t}\right)_i^n + \frac{1}{2}\Delta t^2\left(\frac{\partial^2 u}{\partial t^2}\right)_i^n.$$

From the PDE we have that temporal derivatives can be substituted by spatial
derivatives:

$$\frac{\partial u}{\partial t} = -v\frac{\partial u}{\partial x},$$
and furthermore,
$$\frac{\partial^2 u}{\partial t^2} = v^2\frac{\partial^2 u}{\partial x^2}.$$

Inserted in the Taylor polynomial formula, we get

$$u_i^{n+1} = u_i^n - v\Delta t\left(\frac{\partial u}{\partial x}\right)_i^n + \frac{1}{2}\Delta t^2v^2\left(\frac{\partial^2 u}{\partial x^2}\right)_i^n.$$

To obtain second-order accuracy in space we now use central differences:

$$u_i^{n+1} = u_i^n - v\Delta t[D_{2x}u]_i^n + \frac{1}{2}\Delta t^2v^2[D_xD_xu]_i^n,$$
or written out,

$$u_i^{n+1} = u_i^n - \frac{1}{2}C(u_{i+1}^n - u_{i-1}^n) + \frac{1}{2}C^2(u_{i+1}^n - 2u_i^n + u_{i-1}^n).$$


This is the explicit Lax-Wendroff scheme. 
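The written-out formula translates directly into code. The sketch below assumes a periodic mesh where only the distinct points $x_0,\ldots,x_{N-1}$ are stored and uses `np.roll` for the neighbor values; this storage convention and the function name are choices made for the example and differ from the solver function in the text.

```python
import numpy as np

def lax_wendroff_periodic(u0, C, num_steps):
    """Lax-Wendroff on a periodic mesh; u0 holds x_0, ..., x_{N-1} only."""
    u = u0.copy()
    for _ in range(num_steps):
        up = np.roll(u, -1)   # u_{i+1}, wrapping at the right end
        um = np.roll(u, 1)    # u_{i-1}, wrapping at the left end
        u = u - 0.5*C*(up - um) + 0.5*C**2*(up - 2*u + um)
    return u

N = 100
x = np.linspace(0, 1, N, endpoint=False)
u0 = np.exp(-0.5*((x - 0.5)/0.05)**2)
u_full = lax_wendroff_periodic(u0, C=1.0, num_steps=N)   # one revolution
u_part = lax_wendroff_periodic(u0, C=0.8, num_steps=N)   # 80% of a revolution
print(np.abs(u_full - u0).max(), u_part.max())
```

With $C = 1$ the update collapses to $u_i^{n+1} = u_{i-1}^n$, so the pulse returns unchanged after one revolution, while $C = 0.8$ shows the numerical errors of the scheme.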


Lax-Wendroff works because of artificial viscosity 

From the formulas above, we notice that the Lax-Wendroff method is nothing 
but a Forward Euler, central difference in space scheme, which we have shown 
to be useless because of chronic instability, plus an artificial diffusion term of 
strength $\frac{1}{2}\Delta t v^2$. It means that we can take an unstable scheme and add some
diffusion to stabilize it. This is a common trick to deal with advection problems. 
Sometimes, the real physical diffusion is not sufficiently large to make schemes 
stable, so then we also add artificial diffusion. 


From an analysis similar to the ones carried out above, we get an amplification
factor for the Lax-Wendroff method that equals

$$A = 1 - iC\sin p - 2C^2\sin^2(p/2) .$$

Since $|A|^2 = 1 - 4C^2(1 - C^2)\sin^4(p/2)$, this means that $|A| \le 1$ for $C \le 1$,
and that $|A| = 1$, with an exact solution, if C = 1!
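
The magnitude of the amplification factor is easy to inspect numerically. The sketch below simply evaluates the formula for A on a range of p values; the function name `A_lax_wendroff` and the chosen Courant numbers are illustrative assumptions.

```python
import numpy as np

def A_lax_wendroff(C, p):
    """Amplification factor of the Lax-Wendroff scheme (formula from the text)."""
    return 1 - 1j*C*np.sin(p) - 2*C**2*np.sin(p/2)**2

p = np.linspace(0.01, np.pi, 200)
for C in (0.5, 0.8, 1.0, 1.2):
    # Largest |A| over all resolvable wave numbers; a value above 1 signals growth
    print('C=%.1f  max|A|=%.6f' % (C, np.abs(A_lax_wendroff(C, p)).max()))
```

For C = 1 the factor reduces to $e^{-ip}$, which is the exact one-cell shift per time step.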


4.1.9 Analysis of Dispersion Relations 


We have developed expressions for $A(C, p)$ in the exact solution $u_q^n = A^n e^{ikq\Delta x}$
of the discrete equations. Note that the Fourier component that solves the original
PDE problem has no damping and moves with constant velocity v. There are two
basic errors in the numerical Fourier component: there may be damping and the
wave velocity may depend on C and $p = k\Delta x$.

The shortest wavelength that can be represented is $\lambda = 2\Delta x$. The corresponding
k is $k = 2\pi/\lambda = \pi/\Delta x$, so $p = k\Delta x \in (0, \pi]$.

Given a complex A as a function of C and p, how can we visualize it? The two
key ingredients in A are the magnitude, reflecting damping or growth of the wave,
and the angle, closely related to the velocity of the wave. The Fourier component

$$D^n e^{ik(x - ct)}$$

has damping D and wave velocity c. Let us express our A in polar form,
$A = A_r e^{-i\phi}$, and insert this expression in our discrete component
$u_q^n = A^n e^{ikq\Delta x} = A^n e^{ikx}$:

$$u_q^n = A_r^n e^{-i\phi n} e^{ikx} = A_r^n e^{i(kx - n\phi)} = A_r^n e^{ik(x - ct)},$$

for

$$c = \frac{\phi}{k\Delta t} .$$

Now,

$$k\Delta t = \frac{Ck\Delta x}{v} = \frac{Cp}{v},$$

so

$$c = \frac{\phi v}{Cp} .$$

An appropriate dimensionless quantity to plot is the scaled wave velocity c/v:

$$\frac{c}{v} = \frac{\phi}{Cp} .$$

Fig. 4.8 Dispersion relations for C = 1
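
The polar-form recipe can be sketched in a few lines of code. The helper names `scaled_velocity` and `damping` are hypothetical, and the upwind amplification factor $A = 1 - C(1 - e^{-ip})$ used for comparison is an assumption of this example (it follows from inserting a Fourier component in the Forward Euler, upwind scheme).

```python
import numpy as np

def scaled_velocity(A, C, p):
    """Return c/v = phi/(C*p), where A = A_r*exp(-i*phi)."""
    phi = -np.angle(A)
    return phi/(C*p)

def damping(A):
    """Return the damping factor A_r = |A|."""
    return np.abs(A)

C = 0.8
p = np.linspace(0.01, np.pi, 100)
A_LW = 1 - 1j*C*np.sin(p) - 2*C**2*np.sin(p/2)**2   # Lax-Wendroff
A_UP = 1 - C*(1 - np.exp(-1j*p))                    # upwind (assumed form)

for name, A in ('LW', A_LW), ('UP', A_UP):
    print(name, 'c/v at shortest wave:', scaled_velocity(A, C, p)[-1],
          ' A_r at shortest wave:', damping(A)[-1])
```

Plotting `scaled_velocity` and `damping` against `p` for each scheme gives curves of the kind described for the dispersion figures.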


Figures 4.8-4.13 contain dispersion curves, velocity and damping, for various
values of C. The horizontal axis shows the dimensionless frequency p of the wave,
while the figures to the left illustrate the error in wave velocity c/v (should ideally
be 1 for all p), and the figures to the right display the absolute value (magnitude) of
the damping factor $A_r$. The curves are labeled according to the table below.


Label   Method
FE      Forward Euler in time, centered difference in space
LF      Leapfrog in time, centered difference in space
UP      Forward Euler in time, upwind difference in space
CN      Crank-Nicolson in time, centered difference in space
LW      Lax-Wendroff's method
BE      Backward Euler in time, centered difference in space

Fig. 4.11 Dispersion relations for C = 0.8


The total damping after some time $T = n\Delta t$ is reflected by $A_r(C, p)^n$. Since
normally $A_r < 1$, the damping goes like $A_r^{1/\Delta t}$ and approaches zero as $\Delta t \to 0$.
The only way to reduce damping is to increase C and/or the mesh resolution.

We can learn a lot from the dispersion relation plots. For example, looking at
the plots for C = 1, the schemes LW, UP, and LF have no amplitude reduction, but
LF has wrong phase velocity for the shortest wave in the mesh. This wave does not
(normally) have enough amplitude to be seen, so for all practical purposes, there
is no damping or wrong velocity of the individual waves, so the total shape of the
wave is also correct. For the CN scheme, see Fig. 4.6, each individual wave keeps its
amplitude, but the waves move with different velocities, so after a while, we see some
of these waves lagging behind. For the BE scheme, see Fig. 4.7, all the shorter
waves are so heavily damped that we cannot see them after a while. We see only
the longest waves, which have slightly wrong velocity, but visible amplitudes are
sufficiently equal to produce what looks like a smooth profile.

Fig. 4.12 Dispersion relations for C = 0.5
Fig. 4.13 Dispersion relations for C = 0.5

Another feature was that the Leapfrog method produced oscillations, while the 
upwind scheme did not. Since the Leapfrog method does not dampen the shorter 
waves, which have wrong wave velocities of order 10 percent, we can see these 
waves as noise. The upwind scheme, however, dampens these waves. The same 
effect is also present in the Lax-Wendroff scheme, but the damping of the interme- 
diate waves is hardly present, so there is visible noise in the total signal. 

We realize that, compared to pure truncation error analysis, dispersion analysis 
sheds more light on the behavior of the computational schemes. Truncation analysis 
just says that Lax-Wendroff is better than upwind, because of the increased order in 
time, but most people would say upwind is the better one when looking at the plots.


4.2 One-Dimensional Stationary Advection-Diffusion Equation


Now we pay attention to a physical process where advection (or convection) is in
balance with diffusion:

$$v\frac{du}{dx} = \alpha\frac{d^2u}{dx^2} . \tag{4.11}$$

For simplicity, we assume v and $\alpha$ to be constant, but the extension to the variable-
coefficient case is trivial. This equation can be viewed as the stationary limit of the
corresponding time-dependent problem

$$\frac{\partial u}{\partial t} + v\frac{\partial u}{\partial x} = \alpha\frac{\partial^2 u}{\partial x^2} . \tag{4.12}$$


Equations of the form (4.11) or (4.12) arise from transport phenomena, either 
mass or heat transport. One can also view the equations as a simple model problem 
for the Navier-Stokes equations. With the chosen boundary conditions, the dif- 
ferential equation problem models the phenomenon of a boundary layer, where the 
solution changes rapidly very close to the boundary. This is a characteristic of many
fluid flow problems, and it places strong demands on numerical methods. The
fundamental numerical difficulty is related to non-physical oscillations of the solution
(instability) if the first-derivative spatial term dominates over the second-derivative
term.


4.2.1 A Simple Model Problem 
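
We consider (4.11) on [0, L] equipped with the boundary conditions $u(0) = U_0$,
$u(L) = U_L$. By scaling we can reduce the number of parameters in the problem.
We scale x by $\bar x = x/L$, and u by

$$\bar u = \frac{u - U_0}{U_L - U_0} .$$

Inserted in the governing equation we get

$$\frac{v(U_L - U_0)}{L}\frac{d\bar u}{d\bar x} = \frac{\alpha(U_L - U_0)}{L^2}\frac{d^2\bar u}{d\bar x^2}, \quad \bar u(0) = 0,\ \bar u(1) = 1 .$$

Dropping the bars is common. With $\epsilon = \alpha/(vL)$ we can then simplify to

$$\frac{du}{dx} = \epsilon\frac{d^2u}{dx^2}, \quad u(0) = 0,\ u(1) = 1 . \tag{4.13}$$
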
There are two competing effects in this equation: the advection term transports
signals to the right, while the diffusion term transports signals to the left and the
right. The value u(0) = 0 is transported through the domain if $\epsilon$ is small, and
$u \approx 0$ except in the vicinity of x = 1, where u(1) = 1 and the diffusion transports
some information about u(1) = 1 to the left. For large $\epsilon$, diffusion dominates
and u takes on the "average" value, i.e., u gets a linear variation from 0 to 1
throughout the domain.

It turns out that we can find an exact solution to the differential equation problem
and also to many of its discretizations. This is one reason why this model problem
has been so successful in designing and investigating numerical methods for mixed
convection/advection and diffusion. The exact solution reads

$$u_e(x) = \frac{e^{x/\epsilon} - 1}{e^{1/\epsilon} - 1} .$$

The forthcoming plots illustrate this function for various values of $\epsilon$.
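
When evaluating the exact solution in a program, note that $e^{1/\epsilon}$ overflows in floating-point arithmetic for small $\epsilon$. The sketch below uses an algebraically equivalent rewriting, $e^{(x-1)/\epsilon}(1 - e^{-x/\epsilon})/(1 - e^{-1/\epsilon})$, which only involves decaying exponentials. The function name `u_exact` and this rewriting are assumptions of the example, not something prescribed by the text.

```python
import numpy as np

def u_exact(x, eps):
    """Exact solution (exp(x/eps) - 1)/(exp(1/eps) - 1) of u' = eps*u''.

    Rewritten with negative exponents so that small eps does not overflow;
    np.expm1(z) computes exp(z) - 1 accurately for small |z|.
    """
    x = np.asarray(x, dtype=float)
    return np.exp((x - 1)/eps)*np.expm1(-x/eps)/np.expm1(-1/eps)

x = np.linspace(0, 1, 11)
for eps in (1.0, 0.1, 0.001):
    print('eps=%g' % eps, u_exact(x, eps))
```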


4.2.2 A Centered Finite Difference Scheme 
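
The most obvious idea to solve (4.13) is to apply centered differences:

$$[D_{2x}u = \epsilon D_xD_xu]_i$$

for $i = 1,\ldots,N_x - 1$, with $u_0 = 0$ and $u_{N_x} = 1$. Note that this is a coupled system
of algebraic equations involving $u_0,\ldots,u_{N_x}$.

Written out, the scheme becomes a tridiagonal system

$$A_{i,i-1}u_{i-1} + A_{i,i}u_i + A_{i,i+1}u_{i+1} = 0,$$

for $i = 1,\ldots,N_x - 1$, with

$$A_{0,0} = 1,$$
$$A_{i,i-1} = -\frac{1}{2\Delta x} - \frac{\epsilon}{\Delta x^2},$$
$$A_{i,i} = \frac{2\epsilon}{\Delta x^2},$$
$$A_{i,i+1} = \frac{1}{2\Delta x} - \frac{\epsilon}{\Delta x^2},$$
$$A_{N_x,N_x} = 1 .$$

The right-hand side of the linear system is zero except $b_{N_x} = 1$.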

Figure 4.14 shows reasonably accurate results with $N_x = 20$ and $N_x = 40$ cells
in x direction and a value of $\epsilon = 0.1$. Decreasing $\epsilon$ to 0.01 leads to oscillatory
solutions as depicted in Fig. 4.15. This is, unfortunately, a typical phenomenon in
this type of problem: non-physical oscillations arise for small $\epsilon$ unless the resolution
$N_x$ is big enough. Exercise 4.1 develops a precise criterion: u is oscillation-free if

$$\Delta x \le 2\epsilon .$$

If we take the present model as a simplified model for a viscous boundary layer in
real, industrial fluid flow applications, $\epsilon \sim 10^{-6}$ and millions of cells are required
to resolve the boundary layer. Fortunately, this is not strictly necessary as we have
methods in the next section to overcome the problem!

Fig. 4.14 Comparison of exact and numerical solution for $\epsilon = 0.1$ and $N_x = 20, 40$ with centered
differences

Fig. 4.15 Comparison of exact and numerical solution for $\epsilon = 0.01$ and $N_x = 20, 40$ with
centered differences


Solver 
A suitable solver for doing the experiments is presented below. 


    import numpy as np
    import scipy.sparse
    import scipy.sparse.linalg


    def solver(eps, Nx, method='centered'):
        """
        Solver for the two point boundary value problem u'=eps*u'',
        u(0)=0, u(1)=1.
        """
        x = np.linspace(0, 1, Nx+1)       # Mesh points in space
        dx = x[1] - x[0]                  # Make sure dx is compatible with x
        u = np.zeros(Nx+1)

        # Representation of sparse matrix and right-hand side
        diagonal = np.zeros(Nx+1)
        lower = np.zeros(Nx)
        upper = np.zeros(Nx)
        b = np.zeros(Nx+1)

        # Precompute sparse matrix (scipy format)
        if method == 'centered':
            # (u[i+1] - u[i-1])/(2*dx) = eps*(u[i+1] - 2*u[i] + u[i-1])/dx**2
            diagonal[:] = 2*eps/dx**2
            lower[:] = -1/(2*dx) - eps/dx**2
            upper[:] = 1/(2*dx) - eps/dx**2
        elif method == 'upwind':
            # (u[i] - u[i-1])/dx = eps*(u[i+1] - 2*u[i] + u[i-1])/dx**2
            diagonal[:] = 1/dx + 2*eps/dx**2
            lower[:] = -1/dx - eps/dx**2
            upper[:] = -eps/dx**2
        else:
            raise ValueError("method must be 'centered' or 'upwind'")

        # Insert boundary conditions
        upper[0] = 0
        lower[-1] = 0
        diagonal[0] = diagonal[-1] = 1
        b[-1] = 1.0

        # Set up sparse matrix and solve
        A = scipy.sparse.diags(
            diagonals=[diagonal, lower, upper],
            offsets=[0, -1, 1], shape=(Nx+1, Nx+1),
            format='csr')
        u[:] = scipy.sparse.linalg.spsolve(A, b)
        return u, x


4.2.3 Remedy: Upwind Finite Difference Scheme

The scheme can be stabilized by letting the advective transport term, which is the
dominating term, collect its information in the flow direction, i.e., upstream or
upwind of the point in question. So, instead of using a centered difference

$$\left.\frac{du}{dx}\right|_i \approx \frac{u_{i+1} - u_{i-1}}{2\Delta x},$$

we use the one-sided upwind difference

$$\left.\frac{du}{dx}\right|_i \approx \frac{u_i - u_{i-1}}{\Delta x},$$

in case v > 0. For v < 0 we set

$$\left.\frac{du}{dx}\right|_i \approx \frac{u_{i+1} - u_i}{\Delta x} .$$

On compact operator notation form, our upwind scheme can be expressed as

$$[D_x^-u = \epsilon D_xD_xu]_i,$$

provided v > 0 (and $\epsilon > 0$).


We write out the equations and implement them as shown in the program in
Sect. 4.2.2. The results appear in Fig. 4.16 and 4.17: no more oscillations!

Fig. 4.16 Comparison of exact and numerical solution for $\epsilon = 0.1$ and $N_x = 20, 40$ with upwind
difference

Fig. 4.17 Comparison of exact and numerical solution for $\epsilon = 0.01$ and $N_x = 20, 40$ with upwind
difference
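
The difference between the two discretizations can be checked with a small self-contained experiment. This is only a sketch: the names `solve_bvp` and `oscillates` are hypothetical, a dense matrix with `np.linalg.solve` is used instead of a sparse solver to keep the example short, and "oscillation" is simply taken to mean that the computed u is not monotonically increasing.

```python
import numpy as np

def solve_bvp(eps, Nx, method='centered'):
    """Solve u' = eps*u'', u(0)=0, u(1)=1 with a dense matrix (small Nx only)."""
    dx = 1.0/Nx
    A = np.zeros((Nx+1, Nx+1))
    b = np.zeros(Nx+1)
    for i in range(1, Nx):
        if method == 'centered':
            A[i, i-1] = -1/(2*dx) - eps/dx**2
            A[i, i] = 2*eps/dx**2
            A[i, i+1] = 1/(2*dx) - eps/dx**2
        else:  # upwind difference for the first derivative (v > 0)
            A[i, i-1] = -1/dx - eps/dx**2
            A[i, i] = 1/dx + 2*eps/dx**2
            A[i, i+1] = -eps/dx**2
    A[0, 0] = A[Nx, Nx] = 1      # Dirichlet conditions
    b[Nx] = 1
    return np.linalg.solve(A, b)

def oscillates(u, tol=1e-10):
    """True if u decreases somewhere (the exact solution is increasing)."""
    return bool(np.any(np.diff(u) < -tol))

for method in 'centered', 'upwind':
    for eps in 0.1, 0.01:
        u = solve_bvp(eps, Nx=20, method=method)
        print('%-8s eps=%-5g oscillates: %s' % (method, eps, oscillates(u)))
```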


We see that the upwind scheme is always stable, but it gives a thicker boundary
layer when the centered scheme is also stable. Why the upwind scheme is always
stable is easy to understand as soon as we undertake the mathematical analysis
in Exercise 4.1. Moreover, the thicker layer (seemingly larger diffusion) can be
understood by doing Exercise 4.2.


Exact solution for this model problem
It turns out that one can introduce a linear combination of the centered and
upwind differences for the first-derivative term in this model problem. One can
then adjust the weight in the linear combination so that the numerical solution
becomes identical to the analytical solution of the differential equation problem
at any mesh point.


4.3 Time-dependent Convection-Diffusion Equations


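Now it is time to combine time-dependency, convection (advection) and diffusion
into one equation:

$$\frac{\partial u}{\partial t} + v\frac{\partial u}{\partial x} = \alpha\frac{\partial^2 u}{\partial x^2} + f(x,t) . \tag{4.14}$$

Analytical insight  The solution is now governed by convection, a wave, and
diffusion, a loss of amplitude. One possible analytical solution (for f = 0 and t > 0)
is a traveling Gaussian function

$$u(x,t) = \frac{B}{\sqrt{4\pi\alpha t}}\exp\left(-\frac{(x - vt)^2}{4\alpha t}\right) .$$

This function moves with velocity v > 0 to the right (v < 0 to the left) due to
convection, but at the same time its amplitude is damped like $t^{-1/2}$, and its width
grows like $\sqrt{\alpha t}$, due to diffusion.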


4.3.1 Forward in Time, Centered in Space Scheme 


The Forward Euler for the diffusion equation is a successful scheme, but it has
a very strict stability condition. The similar Forward in time, centered in space
strategy always gives unstable solutions for the advection PDE. What happens when
we have both diffusion and advection present at once?

$$[D_t^+u + vD_{2x}u = \alpha D_xD_xu + f]_i^n .$$

We expect that diffusion will stabilize the scheme, but that advection will destabilize
it.

Another problem is non-physical oscillations, but not growing amplitudes, due 
to centered differences in the advection term. There will hence be two types of 
instabilities to consider. Our analysis showed that pure advection with centered 
differences in space needs some artificial diffusion to become stable (and then it 
produces upwind differences for the advection term). Adding more physical diffu- 
sion should further help the numerics to stabilize the non-physical oscillations. 

The scheme is quickly implemented, but suffers from the need for small space 
and time steps, according to this reasoning. A better approach is to get rid of the 
non-physical oscillations in space by simply applying an upwind difference on the 
advection term. 


4.3.2 Forward in Time, Upwind in Space Scheme 

A good approximation for the pure advection equation is to use upwind
discretization of the advection term. We also know that centered differences are good
for the diffusion term, so let us combine these two discretizations:

$$[D_t^+u + vD_x^-u = \alpha D_xD_xu + f]_i^n, \tag{4.15}$$

for v > 0. Use $vD_x^+u$ if v < 0. In this case the physical diffusion and the extra
numerical diffusion $v\Delta x/2$ will stabilize the solution, but give an overall too large
reduction in amplitude compared with the exact solution.

We may also interpret the upwind difference as artificial numerical diffusion and
centered differences in space everywhere, so the scheme can be expressed as

$$\left[D_t^+u + vD_{2x}u = \alpha D_xD_xu + \frac{v\Delta x}{2}D_xD_xu + f\right]_i^n . \tag{4.16}$$
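
A minimal sketch of one time step of the Forward in time, upwind in space scheme may look as follows. The function name `ftus_step`, the choice f = 0, the fixed boundary values, and the time step rule are all assumptions of this example. The rule $C + 2F \le 1$, with $C = v\Delta t/\Delta x$ and $F = \alpha\Delta t/\Delta x^2$, is used because it makes all three coefficients in the update non-negative, which is a sufficient (not necessarily sharp) condition for avoiding growth and new extrema.

```python
import numpy as np

def ftus_step(u, v, alpha, dx, dt):
    """One Forward Euler step with upwind advection (v > 0) and centered
    diffusion. Assumptions: f = 0 and boundary values are kept fixed."""
    C = v*dt/dx            # Courant number
    F = alpha*dt/dx**2     # mesh Fourier number
    u_new = u.copy()
    u_new[1:-1] = (u[1:-1] - C*(u[1:-1] - u[:-2])
                   + F*(u[2:] - 2*u[1:-1] + u[:-2]))
    return u_new

v, alpha = 1.0, 0.01
x = np.linspace(0, 1, 101)
dx = x[1] - x[0]
dt = 0.9/(v/dx + 2*alpha/dx**2)     # gives C + 2F = 0.9
u = np.exp(-200*(x - 0.25)**2)      # Gaussian pulse centered at x = 0.25
for n in range(100):
    u = ftus_step(u, v, alpha, dx, dt)
print('peak value %.3f at x = %.2f' % (u.max(), x[u.argmax()]))
```

The pulse travels to the right while its peak is reduced by both the physical and the numerical diffusion.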


4.4 Applications of Advection Equations 


There are two major areas where advection and convection applications arise:
transport of a substance and heat transport in a fluid. To derive the models, we may look
at the similar derivations of diffusion models in Sect. 3.8, but change the assumption
from a solid to fluid medium. This gives rise to the extra advection or convection
term $v\cdot\nabla u$. We briefly show how this is done.

Normally, transport in a fluid is dominated by the fluid flow and not diffusion,
so we can neglect diffusion compared to advection or convection. The end result is
anyway an equation of the form

$$\frac{\partial u}{\partial t} + v\cdot\nabla u = 0 .$$


4.4.1 Transport of a Substance 


The diffusion of a substance in Sect. 3.8.1 takes place in a solid medium, but in a
fluid we can have two transport mechanisms: one by diffusion and one by
advection. The latter arises from the fact that the substance particles are moved with the
fluid velocity v such that the effective flux now consists of two and not only one
component as in (3.121):

$$q = -\alpha\nabla c + vc .$$

Inserted in the equation $\nabla\cdot q = 0$ we get the extra advection term $\nabla\cdot(vc)$. Very
often we deal with incompressible flows, $\nabla\cdot v = 0$, such that the advective term
becomes $v\cdot\nabla c$. The mass transport equation for a substance then reads

$$\frac{\partial c}{\partial t} + v\cdot\nabla c = \alpha\nabla^2 c . \tag{4.17}$$


4.4.2 Transport of Heat in Fluids 


The derivation of the heat equation in Sect. 3.8.2 is limited to heat transport in
solid bodies. If we turn the attention to heat transport in fluids, we get a material
derivative of the internal energy in (3.123),

$$\frac{De}{dt} = -\nabla\cdot q,$$

and more terms if work by stresses is also included, where

$$\frac{De}{dt} = \frac{\partial e}{\partial t} + v\cdot\nabla e,$$

v being the velocity of the fluid. The convective term $v\cdot\nabla e$ must therefore be added
to the governing equation, resulting typically in

$$\varrho c\left(\frac{\partial T}{\partial t} + v\cdot\nabla T\right) = \nabla\cdot(k\nabla T) + f, \tag{4.18}$$

where f is some external heating inside the medium.


4.5 Exercises 


Exercise 4.1: Analyze 1D stationary convection-diffusion problem 
Explain the observations in the numerical experiments from Sect. 4.2.2 and 4.2.3 
by finding exact numerical solutions. 


Hint  The difference equations allow solutions on the form $A^i$, where A is an
unknown constant and i is a mesh point counter. There are two solutions for A, so the
general solution is a linear combination of the two, where the constants in the linear
combination are determined from the boundary conditions.

Filename: twopt_BVP_analysis1. 


Exercise 4.2: Interpret upwind difference as artificial diffusion 

Consider an upwind, one-sided difference approximation to a term du/dx in a dif- 
ferential equation. Show that this formula can be expressed as a centered difference 
plus an artificial diffusion term of strength proportional to Ax. This means that 
introducing an upwind difference also means introducing extra diffusion of order 
O(Ax). 

Filename: twopt_BVP_analysis2.


Nonlinear Problems


5.1 Introduction of Basic Concepts 

5.1.1 Linear Versus Nonlinear Equations 

Algebraic equations  A linear, scalar, algebraic equation in x has the form

$$ax + b = 0,$$

for arbitrary real constants a and b. The unknown is a number x. All other algebraic
equations, e.g., $x^2 + ax + b = 0$, are nonlinear. The typical feature in a nonlinear
algebraic equation is that the unknown appears in products with itself, like $x^2$ or
$e^x = 1 + x + \frac{1}{2}x^2 + \frac{1}{3!}x^3 + \cdots$.

We know how to solve a linear algebraic equation, $x = -b/a$, but there are no
general methods for finding the exact solutions of nonlinear algebraic equations,
except for very special cases (quadratic equations constitute a primary example). A
nonlinear algebraic equation may have no solution, one solution, or many solutions.
The tools for solving nonlinear algebraic equations are iterative methods, where we
construct a sequence of linear equations, which we know how to solve, and hope that
the solutions of the linear equations converge to a solution of the nonlinear equation
we want to solve. Typical methods for nonlinear algebraic equations are
Newton's method, the Bisection method, and the Secant method.
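
As an illustration of the idea of solving a sequence of linear equations, here is a minimal sketch of Newton's method for a scalar equation f(x) = 0. The function name `newton`, the tolerance, and the test equation $x^2 - 3x + 2 = 0$ are choices made for this example only.

```python
def newton(f, dfdx, x0, tol=1e-12, max_iter=50):
    """Find x such that |f(x)| < tol, starting from x0.

    In each iteration we solve the linear equation
    f(x) + f'(x)*(x_new - x) = 0 for x_new.
    """
    x = x0
    for k in range(max_iter):
        fx = f(x)
        if abs(fx) < tol:
            return x, k
        x = x - fx/dfdx(x)
    raise RuntimeError('No convergence in %d iterations' % max_iter)

# x**2 + a*x + b = 0 with a=-3, b=2 has the roots 1 and 2
a, b = -3.0, 2.0
f = lambda x: x**2 + a*x + b
dfdx = lambda x: 2*x + a
for x0 in 0.0, 5.0:
    root, iterations = newton(f, dfdx, x0)
    print('start %g -> root %.12f in %d iterations' % (x0, root, iterations))
```

Which of the two roots is found depends on the start value, a typical feature of iterative methods for nonlinear equations.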


Differential equations The unknown in a differential equation is a function and 
not a number. In a linear differential equation, all terms involving the unknown 
function are linear in the unknown function or its derivatives. Linear here means 
that the unknown function, or a derivative of it, is multiplied by a number or a 
known function. All other differential equations are non-linear. 

The easiest way to see if an equation is nonlinear is to spot nonlinear terms
where the unknown function or its derivatives are multiplied by each other. For
example, in

$$u'(t) = -a(t)u(t) + b(t),$$

the terms involving the unknown function u are linear: u' contains the derivative of
the unknown function multiplied by unity, and au contains the unknown function
multiplied by a known function. However,

$$u'(t) = u(t)(1 - u(t)),$$

is nonlinear because of the term $-u^2$ where the unknown function is multiplied by
itself. Also

$$\frac{\partial u}{\partial t} + u\frac{\partial u}{\partial x} = 0,$$

is nonlinear because of the term $uu_x$, where the unknown function appears in a
product with its derivative. (Note here that we use different notations for derivatives:
u' or du/dt for a function u(t) of one variable, $\partial u/\partial t$ or $u_t$ for a function of more than
one variable.)

Another example of a nonlinear equation is

$$u'' + \sin(u) = 0,$$

because sin(u) contains products of u, which becomes clear if we expand the
function in a Taylor series:

$$\sin(u) = u - \frac{1}{6}u^3 + \ldots$$


Mathematical proof of linearity
To really prove mathematically that some differential equation in an unknown u
is linear, show for each term T(u) that with $u = au_1 + bu_2$ for constants a and b,

$$T(au_1 + bu_2) = aT(u_1) + bT(u_2) .$$

For example, the term $T(u) = (\sin^2 t)u'(t)$ is linear because

$$\begin{aligned}
T(au_1 + bu_2) &= (\sin^2 t)(au_1'(t) + bu_2'(t))\\
&= a(\sin^2 t)u_1'(t) + b(\sin^2 t)u_2'(t)\\
&= aT(u_1) + bT(u_2) .
\end{aligned}$$

However, $T(u) = \sin u$ is nonlinear because

$$T(au_1 + bu_2) = \sin(au_1 + bu_2) \neq a\sin u_1 + b\sin u_2 .$$


5.1.2 A Simple Model Problem 


A series of forthcoming examples will explain how to tackle nonlinear differential 
equations with various techniques. We start with the (scaled) logistic equation as 
model problem: 


u'(t) = u(t)(1 — u(t)). (5.1) 


This is a nonlinear ordinary differential equation (ODE) which will be solved by 
different strategies in the following. Depending on the chosen time discretization 
of (5.1), the mathematical problem to be solved at every time level will either be a 
linear algebraic equation or a nonlinear algebraic equation. In the former case, the
time discretization method transforms the nonlinear ODE into linear subproblems
at each time level, and the solution is straightforward to find since linear algebraic 
equations are easy to solve. However, when the time discretization leads to nonlin- 
ear algebraic equations, we cannot (except in very rare cases) solve these without 
turning to approximate, iterative solution methods. 

The next subsections introduce various methods for solving nonlinear differen- 
tial equations, using (5.1) as model. We shall go through the following set of cases: 


* explicit time discretization methods (with no need to solve nonlinear algebraic
  equations)
* implicit Backward Euler time discretization, leading to nonlinear algebraic
  equations solved by
  - an exact analytical technique
  - Picard iteration based on manual linearization
  - a single Picard step
  - Newton's method
* implicit Crank-Nicolson time discretization and linearization via a geometric
  mean formula


Thereafter, we compare the performance of the various approaches. Despite the 
simplicity of (5.1), the conclusions reveal typical features of the various methods in 
much more complicated nonlinear PDE problems. 


5.1.3 Linearization by Explicit Time Discretization 


Time discretization methods are divided into explicit and implicit methods. Explicit 
methods lead to a closed-form formula for finding new values of the unknowns, 
while implicit methods give a linear or nonlinear system of equations that couples 
(all) the unknowns at a new time level. Here we shall demonstrate that explicit 
methods constitute an efficient way to deal with nonlinear differential equations. 

The Forward Euler method is an explicit method. When applied to (5.1), sampled
at $t = t_n$, it results in

$$\frac{u^{n+1} - u^n}{\Delta t} = u^n(1 - u^n),$$

which is a linear algebraic equation for the unknown value $u^{n+1}$ that we can easily
solve:

$$u^{n+1} = u^n + \Delta t\,u^n(1 - u^n) .$$

In this case, the nonlinearity in the original equation poses no difficulty in the
discrete algebraic equation. Any other explicit scheme in time will also give only
linear algebraic equations to solve. For example, a typical 2nd-order Runge-Kutta
method for (5.1) leads to the following formulas:

$$u^* = u^n + \Delta t\,u^n(1 - u^n),$$
$$u^{n+1} = u^n + \Delta t\,\frac{1}{2}\left(u^n(1 - u^n) + u^*(1 - u^*)\right) .$$

The first step is linear in the unknown $u^*$. Then $u^*$ is known in the next step, which
is linear in the unknown $u^{n+1}$.
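
Both explicit schemes are only a few lines of code. The sketch below uses hypothetical function names (`forward_euler`, `rk2`, `logistic_exact`) and relies on the closed-form solution $u(t) = u_0e^t/(1 + u_0(e^t - 1))$ of the scaled logistic equation for comparison; that formula is an addition of this example and is not taken from the text above.

```python
import math

def forward_euler(u0, dt, N):
    """Forward Euler for u' = u(1 - u): explicit update, nothing to solve."""
    u = [u0]
    for n in range(N):
        u.append(u[n] + dt*u[n]*(1 - u[n]))
    return u

def rk2(u0, dt, N):
    """2nd-order Runge-Kutta: Forward Euler predictor u_star, then an
    average of the right-hand side at u[n] and u_star."""
    u = [u0]
    for n in range(N):
        u_star = u[n] + dt*u[n]*(1 - u[n])
        u.append(u[n] + dt*0.5*(u[n]*(1 - u[n]) + u_star*(1 - u_star)))
    return u

def logistic_exact(t, u0):
    """Closed-form solution of u' = u(1 - u), u(0) = u0."""
    return u0*math.exp(t)/(1 + u0*(math.exp(t) - 1))

u0, dt, N = 0.1, 0.1, 40
T = N*dt
print('FE  error at T: %.2e' % abs(forward_euler(u0, dt, N)[-1] - logistic_exact(T, u0)))
print('RK2 error at T: %.2e' % abs(rk2(u0, dt, N)[-1] - logistic_exact(T, u0)))
```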
5.1.4 Exact Solution of Nonlinear Algebraic Equations


Switching to a Backward Euler scheme for (5.1),

$$\frac{u^n - u^{n-1}}{\Delta t} = u^n(1 - u^n), \tag{5.2}$$

results in a nonlinear algebraic equation for the unknown value $u^n$. The equation is
of quadratic type:

$$\Delta t(u^n)^2 + (1 - \Delta t)u^n - u^{n-1} = 0,$$

and may be solved exactly by the well-known formula for such equations.
Before we do so, however, we will introduce a shorter, and often cleaner, notation
for nonlinear algebraic equations at a given time level. The notation is inspired by
the natural notation (i.e., variable names) used in a program, especially in more
advanced partial differential equation problems. The unknown in the algebraic
equation is denoted by u, while $u^{(1)}$ is the value of the unknown at the previous
time level (in general, $u^{(\ell)}$ is the value of the unknown $\ell$ levels back in time). The
notation will be frequently used in later sections. What is meant by u should be
evident from the context: u may either be 1) the exact solution of the ODE/PDE
problem, 2) the numerical approximation to the exact solution, or 3) the unknown
solution at a certain time level.

The quadratic equation for the unknown $u^n$ in (5.2) can, with the new notation,
be written

$$F(u) = \Delta t\,u^2 + (1 - \Delta t)u - u^{(1)} = 0 . \tag{5.3}$$

The solution is readily found to be

$$u = \frac{1}{2\Delta t}\left(-1 + \Delta t \pm \sqrt{(1 - \Delta t)^2 + 4\Delta t\,u^{(1)}}\right) . \tag{5.4}$$


Now we encounter a fundamental challenge with nonlinear algebraic equations:
the equation may have more than one solution. How do we pick the right solution?
This is in general a hard problem. In the present simple case, however, we can
analyze the roots mathematically and provide an answer. The idea is to expand the
roots in a series in $\Delta t$ and truncate after the linear term since the Backward Euler
scheme will introduce an error proportional to $\Delta t$ anyway. Using sympy, we find
the following Taylor series expansions of the roots:

    >>> import sympy as sym
    >>> dt, u_1, u = sym.symbols('dt u_1 u')
    >>> r1, r2 = sym.solve(dt*u**2 + (1-dt)*u - u_1, u)  # find roots
    >>> r1
    (dt - sqrt(dt**2 + 4*dt*u_1 - 2*dt + 1) - 1)/(2*dt)
    >>> r2
    (dt + sqrt(dt**2 + 4*dt*u_1 - 2*dt + 1) - 1)/(2*dt)
    >>> print(r1.series(dt, 0, 2))  # 2 terms in dt, around dt=0
    -1/dt + 1 - u_1 + dt*(u_1**2 - u_1) + O(dt**2)
    >>> print(r2.series(dt, 0, 2))
    u_1 + dt*(-u_1**2 + u_1) + O(dt**2)

We see that the r1 root, corresponding to a minus sign in front of the square root in
(5.4), behaves as $1/\Delta t$ and will therefore blow up as $\Delta t \to 0$! Since we know that
u takes on finite values, actually it is less than or equal to 1, only the r2 root is of
relevance in this case: as $\Delta t \to 0$, $u \to u^{(1)}$, which is the expected result.

For those who are not well experienced with approximating mathematical
formulas by series expansion, an alternative method of investigation is simply to compute
the limits of the two roots as $\Delta t \to 0$ and see if a limit appears unreasonable:

    >>> print(r1.limit(dt, 0))
    -oo
    >>> print(r2.limit(dt, 0))
    u_1
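
With the relevant root identified, a Backward Euler solver for the logistic equation needs no iteration at all. The following is a sketch under that assumption: the function name `backward_euler_exact` is hypothetical, and the formula inside the loop is the root with the plus sign in front of the square root (r2 in the sympy session).

```python
import math

def backward_euler_exact(u0, dt, N):
    """Backward Euler for u' = u(1 - u), solving the quadratic equation
    dt*u**2 + (1 - dt)*u - u_1 = 0 exactly at each time level."""
    u = [u0]
    for n in range(N):
        u_1 = u[n]   # value at the previous time level
        # Root with plus sign: tends to u_1 as dt -> 0
        u.append((-1 + dt + math.sqrt((1 - dt)**2 + 4*dt*u_1))/(2*dt))
    return u

u = backward_euler_exact(u0=0.1, dt=0.5, N=20)
print(['%.4f' % value for value in u[:5]], '...', '%.4f' % u[-1])
```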


5.1.5 Linearization


When the time integration of an ODE results in a nonlinear algebraic equation, 
we must normally find its solution by defining a sequence of linear equations and 
hope that the solutions of these linear equations converge to the desired solution of 
the nonlinear algebraic equation. Usually, this means solving the linear equation 
repeatedly in an iterative fashion. Alternatively, the nonlinear equation can some- 
times be approximated by one linear equation, and consequently there is no need 
for iteration. 

Constructing a linear equation from a nonlinear one requires linearization of 
each nonlinear term. This can be done manually as in Picard iteration, or fully al- 
gorithmically as in Newton’s method. Examples will best illustrate how to linearize 
nonlinear problems. 


5.1.6 Picard Iteration 

Let us write (5.3) in a more compact form

$$F(u) = au^2 + bu + c = 0,$$

with $a = \Delta t$, $b = 1 - \Delta t$, and $c = -u^{(1)}$. Let $u^-$ be an available approximation
of the unknown u. Then we can linearize the term $u^2$ simply by writing $u^-u$. The
resulting equation, $\hat F(u) = 0$, is now linear and hence easy to solve:

$$F(u) \approx \hat F(u) = au^-u + bu + c = 0 .$$

Since the equation $\hat F = 0$ is only approximate, the solution u does not equal the
exact solution $u_e$ of the exact equation $F(u_e) = 0$, but we can hope that u is closer
to $u_e$ than $u^-$ is, and hence it makes sense to repeat the procedure, i.e., set $u^- = u$
and solve $\hat F(u) = 0$ again. There is no guarantee that u is closer to $u_e$ than $u^-$, but
this approach has proven to be effective in a wide range of applications.

The idea of turning a nonlinear equation into a linear one by using an
approximation $u^-$ of u in nonlinear terms is a widely used approach that goes under
many names: fixed-point iteration, the method of successive substitutions,
nonlinear Richardson iteration, and Picard iteration. We will stick to the latter name.

Picard iteration for solving the nonlinear equation arising from the Backward
Euler discretization of the logistic equation can be written as

$$u = -\frac{c}{au^- + b}, \quad u^- \leftarrow u .$$

The $\leftarrow$ symbol means assignment (we set $u^-$ equal to the value of u). The iteration
is started with the value of the unknown at the previous time level: $u^- = u^{(1)}$.

Some prefer an explicit iteration counter as superscript in the mathematical
notation. Let $u^k$ be the computed approximation to the solution in iteration k. In
iteration k + 1 we want to solve

$$au^ku^{k+1} + bu^{k+1} + c = 0 \quad\Rightarrow\quad u^{k+1} = -\frac{c}{au^k + b}, \quad k = 0, 1, \ldots$$

Since we need to perform the iteration at every time level, the time level counter is
often also included:

$$au^{n,k}u^{n,k+1} + bu^{n,k+1} - u^{n-1} = 0 \quad\Rightarrow\quad u^{n,k+1} = \frac{u^{n-1}}{au^{n,k} + b}, \quad k = 0, 1, \ldots,$$

with the start value $u^{n,0} = u^{n-1}$ and the final converged value $u^n = u^{n,k}$ for
sufficiently large k.

However, we will normally apply a mathematical notation in our final formulas
that is as close as possible to what we aim to write in a computer code and then it
becomes natural to use u and $u^-$ instead of $u^{k+1}$ and $u^k$ or $u^{n,k+1}$ and $u^{n,k}$.


Stopping criteria  The iteration method can typically be terminated when the
change in the solution is smaller than a tolerance $\epsilon_u$:

$$|u - u^-| \le \epsilon_u,$$

or when the residual in the equation is sufficiently small ($< \epsilon_r$),

$$|F(u)| = |au^2 + bu + c| < \epsilon_r .$$
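
A minimal sketch of Picard iteration for one Backward Euler step of the logistic equation, with the residual-based stopping criterion, might look like this. The function name `picard_backward_euler_step`, the default tolerance, and the iteration limit are assumptions of the example; the variable `u_` plays the role of $u^-$.

```python
import math

def picard_backward_euler_step(u_1, dt, eps_r=1e-10, max_iter=100):
    """Solve a*u**2 + b*u + c = 0 (a=dt, b=1-dt, c=-u_1) by Picard iteration.

    Returns the solution u and the number of iterations used.
    """
    a, b, c = dt, 1 - dt, -u_1
    u_ = u_1                          # start value: previous time level
    for k in range(1, max_iter + 1):
        u = -c/(a*u_ + b)             # solve the linearized equation
        if abs(a*u**2 + b*u + c) < eps_r:
            return u, k
        u_ = u                        # update approximation and repeat
    raise RuntimeError('Picard iteration did not converge')

u, k = picard_backward_euler_step(u_1=0.5, dt=0.1)
print('u = %.10f after %d iterations' % (u, k))
```

Replacing the residual test by `abs(u - u_) <= eps_u` gives the other stopping criterion.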


A single Picard iteration  Instead of iterating until a stopping criterion is fulfilled,
one may iterate a specific number of times. Just one Picard iteration is popular as
this corresponds to the intuitive idea of approximating a nonlinear term like $(u^n)^2$
by $u^{n-1}u^n$. This follows from the linearization $u^-u^n$ and the initial choice of
$u^- = u^{n-1}$ at time level $t_n$. In other words, a single Picard iteration corresponds to using
the solution at the previous time level to linearize nonlinear terms. The resulting
discretization becomes (using proper values for a, b, and c)

$$\frac{u^n - u^{n-1}}{\Delta t} = u^n(1 - u^{n-1}), \tag{5.5}$$

which is a linear algebraic equation in the unknown $u^n$, making it easy to solve for
$u^n$ without any need for an alternative notation.

We shall later refer to the strategy of taking one Picard step, or equivalently,
linearizing terms with use of the solution at the previous time step, as the Picard1
method. It is a widely used approach in science and technology, but with some
limitations if $\Delta t$ is not sufficiently small (as will be illustrated later).


Notice 

Equation (5.5) does not correspond to a "pure" finite difference method where
the equation is sampled at a point and derivatives replaced by differences
(because the $u^{n-1}$ term on the right-hand side must then be $u^n$). The best
interpretation of the scheme (5.5) is a Backward Euler difference combined with a single
(perhaps insufficient) Picard iteration at each time level, with the value at the
previous time level as start for the Picard iteration.


5.1.7 Linearization by a Geometric Mean 


We consider now a Crank-Nicolson discretization of (5.1). This means that the time
derivative is approximated by a centered difference,

$$ [D_t u = u(1 - u)]^{n+\frac{1}{2}}, $$

written out as

$$ \frac{u^{n+1} - u^n}{\Delta t} = u^{n+\frac{1}{2}} - (u^{n+\frac{1}{2}})^2 . \tag{5.6} $$

The term $u^{n+\frac{1}{2}}$ is normally approximated by an arithmetic mean,

$$ u^{n+\frac{1}{2}} \approx \frac{1}{2}(u^n + u^{n+1}), $$

such that the scheme involves the unknown function only at the time levels where
we actually intend to compute it. The same arithmetic mean applied to the nonlinear
term gives

$$ (u^{n+\frac{1}{2}})^2 \approx \frac{1}{4}(u^n + u^{n+1})^2, $$

which is nonlinear in the unknown $u^{n+1}$. However, using a geometric mean for
$(u^{n+\frac{1}{2}})^2$ is a way of linearizing the nonlinear term in (5.6):

$$ (u^{n+\frac{1}{2}})^2 \approx u^n u^{n+1} . $$

Using an arithmetic mean on the linear $u^{n+\frac{1}{2}}$ term in (5.6) and a geometric mean
for the second term results in a linearized equation for the unknown $u^{n+1}$:

$$ \frac{u^{n+1} - u^n}{\Delta t} = \frac{1}{2}(u^n + u^{n+1}) - u^n u^{n+1}, $$

which can readily be solved:

$$ u^{n+1} = \frac{1 + \frac{1}{2}\Delta t}{1 + \Delta t u^n - \frac{1}{2}\Delta t}\, u^n . $$

This scheme can be coded directly, and since there is no nonlinear algebraic
equation to iterate over, we skip the simplified notation with $u$ for $u^{n+1}$ and $u^{(1)}$ for
$u^n$. The technique of using a geometric average is an example of transforming a
nonlinear algebraic equation to a linear one, without any need for iterations.

The geometric mean approximation is often very effective for linearizing
quadratic nonlinearities. Both the arithmetic and geometric mean approximations
have truncation errors of order $\Delta t^2$ and are therefore compatible with the
truncation error $O(\Delta t^2)$ of the centered difference approximation for $u'$ in the
Crank-Nicolson method.
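The second-order claim can be checked empirically. The sketch below halves $\Delta t$ and estimates the convergence rate from the maximum error against the exact logistic solution $u(t) = u_0 e^t/(1 - u_0 + u_0 e^t)$. The helper names and the parameter values ($u_0 = 0.1$, $T = 4$) are assumptions chosen for illustration.

```python
import numpy as np

def CN_gm_logistic(u0, dt, Nt):
    """Crank-Nicolson with geometric mean linearization for u' = u(1 - u)."""
    u = np.zeros(Nt + 1)
    u[0] = u0
    for n in range(Nt):
        u[n+1] = (1 + 0.5*dt)/(1 + dt*u[n] - 0.5*dt)*u[n]
    return u

def exact_logistic(u0, t):
    """Exact solution of u' = u(1 - u), u(0) = u0."""
    return u0*np.exp(t)/(1 - u0 + u0*np.exp(t))

def observed_rate(u0=0.1, T=4.0, Nt=40):
    """Estimate the convergence rate r in E ~ C*dt**r from two resolutions."""
    errors = []
    for N in (Nt, 2*Nt):
        dt = T/N
        u = CN_gm_logistic(u0, dt, N)
        t = np.linspace(0, T, N + 1)
        errors.append(np.abs(u - exact_logistic(u0, t)).max())
    return np.log(errors[0]/errors[1])/np.log(2)

print('observed convergence rate:', observed_rate())
```

A rate close to 2 is what the truncation error analysis predicts. A rate near 1 would indicate that the linearization had degraded the accuracy of the time discretization.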

Applying the operator notation for the means and finite differences, the
linearized Crank-Nicolson scheme for the logistic equation can be compactly
expressed as

$$ [D_t u = \overline{u}^{t} - \overline{u^2}^{t,g}]^{n+\frac{1}{2}} . $$


Remark 

If we use an arithmetic instead of a geometric mean for the nonlinear term in
(5.6), we end up with a nonlinear term $(u^{n+1})^2$. This term can be linearized as
$u^-u^{n+1}$ in a Picard iteration approach and in particular as $u^nu^{n+1}$ in a Picard1
iteration approach. The latter gives a scheme almost identical to the one arising
from a geometric mean (the difference in $u^{n+1}$ being $\frac{1}{4}\Delta t u^n(u^{n+1} - u^n) \approx
\frac{1}{4}\Delta t^2 u'u$, i.e., a difference of size $\Delta t^2$).


5.1.8 Newton’s Method 


The Backward Euler scheme (5.2) for the logistic equation leads to a nonlinear
algebraic equation (5.3). Now we write any nonlinear algebraic equation in the
general and compact form

$$ F(u) = 0 . $$

Newton’s method linearizes this equation by approximating $F(u)$ by its Taylor
series expansion around a computed value $u^-$ and keeping only the linear part:

$$ F(u) = F(u^-) + F'(u^-)(u - u^-) + \frac{1}{2}F''(u^-)(u - u^-)^2 + \cdots
\approx F(u^-) + F'(u^-)(u - u^-) = \hat F(u) . $$

The linear equation $\hat F(u) = 0$ has the solution

$$ u = u^- - \frac{F(u^-)}{F'(u^-)} . $$

Expressed with an iteration index in the unknown, Newton’s method takes on the
more familiar mathematical form

$$ u^{k+1} = u^k - \frac{F(u^k)}{F'(u^k)}, \quad k = 0, 1, \ldots $$

It can be shown that the error in iteration $k + 1$ of Newton’s method is
proportional to the square of the error in iteration $k$, a result referred to as quadratic
convergence. This means that for small errors the method converges very fast, and
in particular much faster than Picard iteration and other iteration methods. (The
proof of this result is found in most textbooks on numerical analysis.) However, the
quadratic convergence appears only if $u^k$ is sufficiently close to the solution.
Further away from the solution the method can easily converge very slowly or diverge.
The reader is encouraged to do Exercise 5.3 to get a better understanding of the
behavior of the method.
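The quadratic convergence can be observed numerically. The sketch below applies Newton's method to the quadratic equation $au^2 + bu + c = 0$ arising from one Backward Euler step of the logistic equation and records the error in each iterate. The function name and the values $\Delta t = 0.9$, $u^{(1)} = 0.1$ are assumptions made for this illustration.

```python
def newton_quadratic(a, b, c, u_start, eps_r=1e-14, max_iter=50):
    """Newton's method for F(u) = a*u**2 + b*u + c; returns root and iterates."""
    F = lambda u: a*u**2 + b*u + c
    dF = lambda u: 2*a*u + b
    u_ = u_start
    history = [u_]
    k = 0
    while abs(F(u_)) > eps_r and k < max_iter:
        u_ = u_ - F(u_)/dF(u_)
        history.append(u_)
        k += 1
    return u_, history

dt, u_1 = 0.9, 0.1                      # assumed time step and previous value
a, b, c = dt, 1 - dt, -u_1
root, history = newton_quadratic(a, b, c, u_start=u_1)

# Positive root of the quadratic equation, for comparison
exact = (-b + (b**2 - 4*a*c)**0.5)/(2*a)
errors = [abs(u - exact) for u in history]
for k, e in enumerate(errors):
    print(k, e)
```

For a quadratic $F$ the error obeys $e_{k+1} = a e_k^2/F'(u^k)$ exactly, so once the iterate is close to the root the number of correct digits roughly doubles in each iteration.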

Application of Newton’s method to the logistic equation discretized by the
Backward Euler method is straightforward as we have

$$ F(u) = au^2 + bu + c, \quad a = \Delta t,\ b = 1 - \Delta t,\ c = -u^{(1)}, $$

and then

$$ F'(u) = 2au + b . $$

The iteration method becomes

$$ u = u^- - \frac{a(u^-)^2 + bu^- + c}{2au^- + b}, \quad u^- \leftarrow u . \tag{5.7} $$

At each time level, we start the iteration by setting $u^- = u^{(1)}$. Stopping criteria as
listed for the Picard iteration can be used also for Newton’s method.

An alternative mathematical form, where we write out $a$, $b$, and $c$, and use a time
level counter $n$ and an iteration counter $k$, takes the form

$$ u^{n,k+1} = u^{n,k} - \frac{\Delta t (u^{n,k})^2 + (1 - \Delta t)u^{n,k} - u^{n-1}}{2\Delta t u^{n,k} + 1 - \Delta t},
\quad u^{n,0} = u^{n-1}, \tag{5.8} $$

for $k = 0, 1, \ldots$. A program implementation is much closer to (5.7) than to (5.8),
but the latter is better aligned with the established mathematical notation used in
the literature.


5.1.9 Relaxation 


One iteration in Newton’s method or Picard iteration consists of solving a linear
problem $\hat F(u) = 0$. Sometimes convergence problems arise because the new
solution $u$ of $\hat F(u) = 0$ is “too far away” from the previously computed solution $u^-$.
A remedy is to introduce a relaxation, meaning that we first solve $\hat F(u^*) = 0$ for
a suggested value $u^*$ and then we take $u$ as a weighted mean of what we had, $u^-$,
and what our linearized equation $\hat F = 0$ suggests, $u^*$:

$$ u = \omega u^* + (1 - \omega)u^- . $$

The parameter $\omega$ is known as a relaxation parameter, and a choice $\omega < 1$ may
prevent divergent iterations.

Relaxation in Newton’s method can be directly incorporated in the basic iteration
formula:

$$ u = u^- - \omega\frac{F(u^-)}{F'(u^-)} . \tag{5.9} $$


5.1.10 Implementation and Experiments


The program logistic.py contains implementations of all the methods described 
above. Below is an extract of the file showing how the Picard and Newton methods 
are implemented for a Backward Euler discretization of the logistic equation. 


import numpy as np

def BE_logistic(u0, dt, Nt, choice='Picard',
                eps_r=1E-3, omega=1, max_iter=1000):
    if choice == 'Picard1':
        # Picard1 is Picard iteration restricted to a single iteration
        choice = 'Picard'
        max_iter = 1

    u = np.zeros(Nt+1)
    iterations = []
    u[0] = u0
    for n in range(1, Nt+1):
        # Coefficients in the quadratic equation a*u**2 + b*u + c = 0
        a = dt
        b = 1 - dt
        c = -u[n-1]

        if choice == 'Picard':

            def F(u):
                return a*u**2 + b*u + c

            u_ = u[n-1]       # start value: solution at previous time level
            k = 0
            while abs(F(u_)) > eps_r and k < max_iter:
                # Relaxed Picard update
                u_ = omega*(-c/(a*u_ + b)) + (1-omega)*u_
                k += 1
            u[n] = u_
            iterations.append(k)

        elif choice == 'Newton':

            def F(u):
                return a*u**2 + b*u + c

            def dF(u):
                return 2*a*u + b

            u_ = u[n-1]       # start value: solution at previous time level
            k = 0
            while abs(F(u_)) > eps_r and k < max_iter:
                u_ = u_ - F(u_)/dF(u_)
                k += 1
            u[n] = u_
            iterations.append(k)
    return u, iterations


The Crank-Nicolson method utilizing a linearization based on the geometric 
mean gives a simpler algorithm: 

def CN_logistic(u0, dt, Nt):
    u = np.zeros(Nt+1)
    u[0] = u0
    for n in range(0, Nt):
        u[n+1] = (1 + 0.5*dt)/(1 + dt*u[n] - 0.5*dt)*u[n]
    return u


We may run experiments with the model problem (5.1) and the different
strategies for dealing with nonlinearities as described above. For a quite coarse time
resolution, $\Delta t = 0.9$, use of a tolerance $\epsilon_r = 0.1$ in the stopping criterion
introduces an iteration error, especially in the Picard iterations, that is visibly much
larger than the time discretization error due to a large $\Delta t$. This is illustrated by
comparing the upper two plots in Fig. 5.1. The one to the right has a stricter tolerance
$\epsilon_r = 10^{-3}$, which causes all the curves corresponding to Picard and Newton iteration
to be on top of each other (and no changes can be visually observed by reducing $\epsilon_r$
further). The reason why Newton’s method does much better than Picard iteration
in the upper left plot is that Newton’s method with one step comes far below the
$\epsilon_r$ tolerance, while the Picard iteration needs on average 7 iterations to bring the
residual down to $\epsilon_r = 10^{-1}$, which gives insufficient accuracy in the solution of the
nonlinear equation. It is obvious that the Picard1 method gives significant errors in
addition to the time discretization unless the time step is as small as in the lower
right plot.

Fig. 5.1 Impact of solution strategy and time step length on the solution

Fig. 5.2 Comparison of the number of iterations at various time levels for Picard and Newton
iteration

The BE exact curve corresponds to using the exact solution of the quadratic
equation at each time level, so this curve is only affected by the Backward Euler
time discretization. The CN gm curve corresponds to the theoretically more accurate
Crank-Nicolson discretization, combined with a geometric mean for linearization.
This curve appears more accurate, especially if we take the plot in the lower right
with a small $\Delta t$ and an appropriately small $\epsilon_r$ value as the exact curve.

When it comes to the need for iterations, Fig. 5.2 displays the number of
iterations required at each time level for Newton’s method and Picard iteration. The
smaller $\Delta t$ is, the better starting value we have for the iteration, and the faster the
convergence is. With $\Delta t = 0.9$ Picard iteration requires on average 32 iterations
per time step, but this number is dramatically reduced as $\Delta t$ is reduced.

However, introducing relaxation and a parameter $\omega = 0.8$ immediately reduces
the average of 32 to 7, indicating that for the large $\Delta t = 0.9$, Picard iteration takes
too long steps. An approximately optimal value for $\omega$ in this case is 0.5, which
results in an average of only 2 iterations! An even more dramatic impact of $\omega$
appears when $\Delta t = 1$: Picard iteration does not converge in 1000 iterations, but
$\omega = 0.5$ again brings the average number of iterations down to 2.
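An experiment of this kind can be sketched as follows. The function counts relaxed Picard iterations per time level for the Backward Euler scheme. The function name, the initial value $u_0 = 0.1$, the number of steps, and the tolerance are assumptions made here and need not match the setup behind Fig. 5.2, so the printed averages may differ from the numbers quoted in the text.

```python
def picard_iterations(u0, dt, Nt, omega=1.0, eps_r=1e-3, max_iter=1000):
    """Count relaxed Picard iterations per time level (BE for u' = u(1 - u))."""
    u_prev = u0
    counts = []
    for n in range(1, Nt + 1):
        a, b, c = dt, 1 - dt, -u_prev
        u_ = u_prev
        k = 0
        while abs(a*u_**2 + b*u_ + c) > eps_r and k < max_iter:
            u_ = omega*(-c/(a*u_ + b)) + (1 - omega)*u_
            k += 1
        counts.append(k)
        u_prev = u_
    return counts

for omega in 1.0, 0.8, 0.5:
    counts = picard_iterations(u0=0.1, dt=0.9, Nt=10, omega=omega)
    print('omega=%.1f: average no of iterations=%.1f'
          % (omega, sum(counts)/len(counts)))
```

The unrelaxed Picard map oscillates around the solution with a contraction factor not far below 1 for $\Delta t = 0.9$, which is why averaging the new and old values ($\omega \approx 0.5$) damps the oscillation so effectively.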

Remark The simple Crank-Nicolson method with a geometric mean for the
quadratic nonlinearity gives visually more accurate solutions than the Backward
Euler discretization. Even with a tolerance of $\epsilon_r = 10^{-3}$, all the methods for
treating the nonlinearities in the Backward Euler discretization give graphs that
cannot be distinguished. So for accuracy in this problem, the time discretization
is much more crucial than $\epsilon_r$. Ideally, one should estimate the error in the time
discretization, as the solution progresses, and set $\epsilon_r$ accordingly.


5.1.11 Generalization to a General Nonlinear ODE 


Let us see how the various methods in the previous sections can be applied to the 
more generic model 
u' = f(u,t), (5.10) 


where f is a nonlinear function of u. 


Explicit time discretization Explicit ODE methods like the Forward Euler 
scheme, Runge-Kutta methods and Adams-Bashforth methods all evaluate f at 
time levels where u is already computed, so nonlinearities in f do not pose any 
difficulties. 


Backward Euler discretization Approximating $u'$ by a backward difference leads
to a Backward Euler scheme, which can be written as

$$ F(u^n) = u^n - \Delta t f(u^n, t_n) - u^{n-1} = 0, $$

or alternatively

$$ F(u) = u - \Delta t f(u, t_n) - u^{(1)} = 0 . $$

A simple Picard iteration, not knowing anything about the nonlinear structure of $f$,
must approximate $f(u, t_n)$ by $f(u^-, t_n)$:

$$ \hat F(u) = u - \Delta t f(u^-, t_n) - u^{(1)} . $$

The iteration starts with $u^- = u^{(1)}$ and proceeds with repeating

$$ u^* = \Delta t f(u^-, t_n) + u^{(1)}, \quad u = \omega u^* + (1 - \omega)u^-, \quad u^- \leftarrow u, $$

until a stopping criterion is fulfilled.
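The iteration above translates almost line by line into code. The following is a minimal sketch for one time step. The function name, argument names, and default tolerances are assumptions chosen for illustration.

```python
def BE_picard_step(f, u_1, t_n, dt, omega=1.0, eps_r=1e-6, max_iter=100):
    """One Backward Euler step for u' = f(u, t) by relaxed Picard iteration.

    u_1 is the solution at the previous time level (u^(1) in the text).
    Returns the new value and the number of iterations used.
    """
    F = lambda u: u - dt*f(u, t_n) - u_1       # nonlinear residual
    u_ = u_1                                   # start value
    k = 0
    while abs(F(u_)) > eps_r and k < max_iter:
        u_star = dt*f(u_, t_n) + u_1           # solve linearized equation
        u_ = omega*u_star + (1 - omega)*u_     # relax and update u^-
        k += 1
    return u_, k

# Example: one step of the logistic equation with a small time step
u, k = BE_picard_step(lambda u, t: u*(1 - u), u_1=0.1, t_n=0.1, dt=0.1)
```

Since $f$ is evaluated only at the known $u^-$, this iteration converges only when $\Delta t\,|\partial f/\partial u|$ is sufficiently small, which is one motivation for the more implicit treatments discussed next.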


Explicit vs implicit treatment of nonlinear terms 

Evaluating $f$ for a known $u^-$ is referred to as explicit treatment of $f$, while
if $f(u,t)$ has some structure, say $f(u,t) = u^3$, parts of $f$ can involve the
unknown $u$, as in the manual linearization $(u^-)^2u$, and then the treatment of
$f$ is “more implicit” and “less explicit”. This terminology is inspired by time
discretization of $u' = f(u,t)$, where evaluating $f$ for known $u$ values gives
explicit schemes, while treating $f$ or parts of $f$ implicitly makes $f$ contribute
to the unknown terms in the equation at the new time level.

Explicit treatment of $f$ usually means stricter conditions on $\Delta t$ to achieve
stability of time discretization schemes. The same applies to iteration techniques
for nonlinear algebraic equations: the “less” we linearize $f$ (i.e., the more we
keep of $u$ in the original formula), the faster the convergence may be.

We may say that $f(u,t) = u^3$ is treated explicitly if we evaluate $f$ as $(u^-)^3$,
partially implicit if we linearize as $(u^-)^2u$ and fully implicit if we represent $f$
by $u^3$. (Of course, the fully implicit representation will require further
linearization, but with $f(u,t) = u^2$ a fully implicit treatment is possible if the resulting
quadratic equation is solved with a formula.)

For the ODE $u' = -u^3$ with $f(u,t) = -u^3$ and coarse time resolution
$\Delta t = 0.4$, Picard iteration with $(u^-)^2u$ requires 8 iterations with $\epsilon_r = 10^{-3}$ for
the first time step, while $(u^-)^3$ leads to 22 iterations. After about 10 time steps
both approaches are down to about 2 iterations per time step, but this example
shows a potential of treating $f$ more implicitly.

A trick to treat $f$ implicitly in Picard iteration is to evaluate it as $f(u^-,t)u/u^-$.
For a polynomial $f$, $f(u,t) = u^m$, this corresponds to $(u^-)^m u/u^- =
(u^-)^{m-1}u$. Sometimes this more implicit treatment has no effect, as with
$f(u,t) = \exp(-u)$ and $f(u,t) = \ln(1 + u)$, but with $f(u,t) = \sin(2(u + 1))$,
the $f(u^-,t)u/u^-$ trick leads to 7, 9, and 11 iterations during the first three steps,
while $f(u^-,t)$ demands 17, 21, and 20 iterations. (Experiments can be done
with the code ODE_Picard_tricks.py.)


Newton’s method applied to a Backward Euler discretization of $u' = f(u,t)$
requires computation of the derivative

$$ F'(u) = 1 - \Delta t \frac{\partial f}{\partial u}(u, t_n) . $$

Starting with the solution at the previous time level, $u^- = u^{(1)}$, we can just use the
standard formula

$$ u = u^- - \omega\frac{F(u^-)}{F'(u^-)}
= u^- - \omega\frac{u^- - \Delta t f(u^-, t_n) - u^{(1)}}{1 - \Delta t \frac{\partial}{\partial u} f(u^-, t_n)} . \tag{5.11} $$
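Formula (5.11) can be sketched in code as below, with the user supplying both $f$ and $\partial f/\partial u$ as Python functions. The names `BE_newton_step` and `dfdu` and the default tolerances are assumptions introduced for this illustration.

```python
def BE_newton_step(f, dfdu, u_1, t_n, dt, omega=1.0,
                   eps_r=1e-10, max_iter=50):
    """One Backward Euler step for u' = f(u, t) by (relaxed) Newton iteration.

    dfdu(u, t) must return the partial derivative of f with respect to u.
    u_1 is the solution at the previous time level (u^(1) in the text).
    """
    F = lambda u: u - dt*f(u, t_n) - u_1
    dF = lambda u: 1 - dt*dfdu(u, t_n)
    u_ = u_1                                   # start value
    k = 0
    while abs(F(u_)) > eps_r and k < max_iter:
        u_ = u_ - omega*F(u_)/dF(u_)           # formula (5.11)
        k += 1
    return u_, k

# Example: logistic equation with a coarse time step
u, k = BE_newton_step(f=lambda u, t: u*(1 - u),
                      dfdu=lambda u, t: 1 - 2*u,
                      u_1=0.1, t_n=0.9, dt=0.9)
```

With `omega=1` this reduces to the plain Newton iteration (5.7) when $f(u,t) = u(1-u)$.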


Crank-Nicolson discretization The standard Crank-Nicolson scheme with
arithmetic mean approximation of $f$ takes the form

$$ \frac{u^{n+1} - u^n}{\Delta t} = \frac{1}{2}\left(f(u^{n+1}, t_{n+1}) + f(u^n, t_n)\right) . $$

We can write the scheme as a nonlinear algebraic equation

$$ F(u) = u - u^{(1)} - \Delta t\frac{1}{2}f(u, t_{n+1}) - \Delta t\frac{1}{2}f(u^{(1)}, t_n) = 0 . \tag{5.12} $$

A Picard iteration scheme must in general employ the linearization

$$ \hat F(u) = u - u^{(1)} - \Delta t\frac{1}{2}f(u^-, t_{n+1}) - \Delta t\frac{1}{2}f(u^{(1)}, t_n), $$

while Newton’s method can apply the general formula (5.11) with $F(u)$ given in
(5.12) and

$$ F'(u) = 1 - \frac{1}{2}\Delta t\frac{\partial f}{\partial u}(u, t_{n+1}) . $$

5.1.12 Systems of ODEs 


We may write a system of ODEs

$$ \begin{aligned}
\frac{d}{dt}u_0(t) &= f_0(u_0(t), u_1(t), \ldots, u_N(t), t),\\
\frac{d}{dt}u_1(t) &= f_1(u_0(t), u_1(t), \ldots, u_N(t), t),\\
&\ \ \vdots\\
\frac{d}{dt}u_N(t) &= f_N(u_0(t), u_1(t), \ldots, u_N(t), t),
\end{aligned} $$

as

$$ u' = f(u, t), \quad u(0) = U_0, \tag{5.13} $$

if we interpret $u$ as a vector $u = (u_0(t), u_1(t), \ldots, u_N(t))$ and $f$ as a vector
function with components $(f_0(u,t), f_1(u,t), \ldots, f_N(u,t))$.

Most solution methods for scalar ODEs, including the Forward and Backward
Euler schemes and the Crank-Nicolson method, generalize in a straightforward way
to systems of ODEs simply by using vector arithmetic instead of scalar arithmetic,
which corresponds to applying the scalar scheme to each component of the system.
For example, here is a backward difference scheme applied to each component,

$$ \begin{aligned}
\frac{u_0^n - u_0^{n-1}}{\Delta t} &= f_0(u^n, t_n),\\
\frac{u_1^n - u_1^{n-1}}{\Delta t} &= f_1(u^n, t_n),\\
&\ \ \vdots\\
\frac{u_N^n - u_N^{n-1}}{\Delta t} &= f_N(u^n, t_n),
\end{aligned} $$

which can be written more compactly in vector form as

$$ \frac{u^n - u^{n-1}}{\Delta t} = f(u^n, t_n) . $$

This is a system of algebraic equations,

$$ u^n - \Delta t f(u^n, t_n) - u^{n-1} = 0, $$

or written out

$$ \begin{aligned}
u_0^n - \Delta t f_0(u^n, t_n) - u_0^{n-1} &= 0,\\
&\ \ \vdots\\
u_N^n - \Delta t f_N(u^n, t_n) - u_N^{n-1} &= 0 .
\end{aligned} $$


Example We shall address the $2\times 2$ ODE system for oscillations of a pendulum
subject to gravity and air drag. The system can be written as

$$ \dot\omega = -\sin\theta - \beta\omega|\omega|, \tag{5.14} $$
$$ \dot\theta = \omega, \tag{5.15} $$

where $\beta$ is a dimensionless parameter (this is the scaled, dimensionless version of
the original, physical model). The unknown components of the system are the angle
$\theta(t)$ and the angular velocity $\omega(t)$. We introduce $u_0 = \omega$ and $u_1 = \theta$, which leads
to

$$ \begin{aligned}
u_0' &= f_0(u, t) = -\sin u_1 - \beta u_0|u_0|,\\
u_1' &= f_1(u, t) = u_0 .
\end{aligned} $$

A Crank-Nicolson scheme reads

$$ \frac{u_0^{n+1} - u_0^n}{\Delta t} = -\sin u_1^{n+\frac{1}{2}} - \beta u_0^{n+\frac{1}{2}}|u_0^{n+\frac{1}{2}}|
\approx -\sin\left(\frac{1}{2}(u_1^{n+1} + u_1^n)\right) - \beta\frac{1}{4}(u_0^{n+1} + u_0^n)|u_0^{n+1} + u_0^n|, \tag{5.16} $$

$$ \frac{u_1^{n+1} - u_1^n}{\Delta t} = u_0^{n+\frac{1}{2}} \approx \frac{1}{2}(u_0^{n+1} + u_0^n) . \tag{5.17} $$

This is a coupled system of two nonlinear algebraic equations in two unknowns
$u_0^{n+1}$ and $u_1^{n+1}$.

Using the notation $u_0$ and $u_1$ for the unknowns $u_0^{n+1}$ and $u_1^{n+1}$ in this system,
writing $u_0^{(1)}$ and $u_1^{(1)}$ for the previous values $u_0^n$ and $u_1^n$, multiplying by $\Delta t$ and
moving the terms to the left-hand sides, gives

$$ u_0 - u_0^{(1)} + \Delta t\sin\left(\frac{1}{2}(u_1 + u_1^{(1)})\right) + \frac{1}{4}\Delta t\beta(u_0 + u_0^{(1)})|u_0 + u_0^{(1)}| = 0, \tag{5.18} $$

$$ u_1 - u_1^{(1)} - \frac{1}{2}\Delta t(u_0 + u_0^{(1)}) = 0 . \tag{5.19} $$

Obviously, we have a need for solving systems of nonlinear algebraic equations,
which is the topic of the next section.


5.2 Systems of Nonlinear Algebraic Equations 


Implicit time discretization methods for a system of ODEs, or a PDE, lead to
systems of nonlinear algebraic equations, written compactly as

$$ F(u) = 0, $$

where $u$ is a vector of unknowns $u = (u_0, \ldots, u_N)$, and $F$ is a vector function:
$F = (F_0, \ldots, F_N)$. The system at the end of Sect. 5.1.12 fits this notation with
$N = 1$, $F_0(u)$ given by the left-hand side of (5.18), while $F_1(u)$ is the left-hand
side of (5.19).

Sometimes the equation system has a special structure because of the underlying
problem, e.g.,

$$ A(u)u = b(u), $$

with $A(u)$ as an $(N + 1)\times(N + 1)$ matrix function of $u$ and $b$ as a vector function:
$b = (b_0, \ldots, b_N)$.

We shall next explain how Picard iteration and Newton’s method can be applied
to systems like $F(u) = 0$ and $A(u)u = b(u)$. The exposition has a focus on
ideas and practical computations. More theoretical considerations, including quite
general results on convergence properties of these methods, can be found in Kelley
[8].


5.2.1 Picard Iteration 


We cannot apply Picard iteration to nonlinear equations unless there is some
special structure. For the commonly arising case $A(u)u = b(u)$ we can linearize the
product $A(u)u$ to $A(u^-)u$ and $b(u)$ as $b(u^-)$. That is, we use the most recently
computed approximation in $A$ and $b$ to arrive at a linear system for $u$:

$$ A(u^-)u = b(u^-) . $$

A relaxed iteration takes the form

$$ A(u^-)u^* = b(u^-), \quad u = \omega u^* + (1 - \omega)u^- . $$

In other words, we solve a system of nonlinear algebraic equations as a sequence of
linear systems.


Algorithm for relaxed Picard iteration
Given $A(u)u = b(u)$ and an initial guess $u^-$, iterate until convergence:

1. solve $A(u^-)u^* = b(u^-)$ with respect to $u^*$
2. $u = \omega u^* + (1 - \omega)u^-$
3. $u^- \leftarrow u$


“Until convergence” means that the iteration is stopped when the change in
the unknown, $\|u - u^-\|$, or the residual $\|A(u)u - b(u)\|$, is sufficiently small, see
Sect. 5.2.3 for more details.
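The algorithm can be sketched with NumPy as follows, where `A` and `b` are user-supplied functions returning the matrix and the right-hand side for a given vector. The function name, the residual-based stopping test, and the small $2\times 2$ test system are assumptions made for illustration only.

```python
import numpy as np

def picard_system(A, b, u_init, omega=1.0, eps_r=1e-8, max_iter=100):
    """Relaxed Picard iteration for A(u)u = b(u).

    A(u) returns a matrix and b(u) a vector. The iteration stops when
    the Euclidean norm of the residual A(u)u - b(u) is below eps_r.
    """
    u_ = np.asarray(u_init, dtype=float)
    for k in range(max_iter):
        if np.linalg.norm(A(u_) @ u_ - b(u_)) <= eps_r:
            return u_, k
        u_star = np.linalg.solve(A(u_), b(u_))   # step 1
        u_ = omega*u_star + (1 - omega)*u_       # steps 2 and 3
    return u_, max_iter

# Hypothetical test system with a solution-dependent diagonal
A = lambda u: np.array([[2 + u[0]**2, -1.0],
                        [-1.0, 2 + u[1]**2]])
b = lambda u: np.array([1.0, 1.0])
u, k = picard_system(A, b, u_init=[0.0, 0.0])
```

By symmetry, both components of the solution of this test system equal the real root of $u^3 + u - 1 = 0$, which gives a simple check of the result.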


5.2.2 Newton’s Method 


The natural starting point for Newton’s method is the general nonlinear vector
equation $F(u) = 0$. As for a scalar equation, the idea is to approximate $F$ around a
known value $u^-$ by a linear function $\hat F$, calculated from the first two terms of a
Taylor expansion of $F$. In the multi-variate case these two terms become

$$ F(u^-) + J(u^-)\cdot(u - u^-), $$

where $J$ is the Jacobian of $F$, defined by

$$ J_{i,j} = \frac{\partial F_i}{\partial u_j} . $$

So, the original nonlinear system is approximated by

$$ \hat F(u) = F(u^-) + J(u^-)\cdot(u - u^-) = 0, $$

which is linear in $u$ and can be solved in a two-step procedure: first solve
$J\delta u = -F(u^-)$ with respect to the vector $\delta u$ and then update $u = u^- + \delta u$. A relaxation
parameter can easily be incorporated:

$$ u = \omega(u^- + \delta u) + (1 - \omega)u^- = u^- + \omega\delta u . $$


Algorithm for Newton's method
Given $F(u) = 0$ and an initial guess $u^-$, iterate until convergence:

1. solve $J\delta u = -F(u^-)$ with respect to $\delta u$
2. $u = u^- + \omega\delta u$
3. $u^- \leftarrow u$
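As an illustration of the algorithm, the sketch below implements a generic Newton solver and applies it to one Crank-Nicolson step of the pendulum system (5.18)-(5.19). The function names are assumptions, and the Jacobian has been derived by hand here from (5.18)-(5.19) using $\frac{d}{ds}(s|s|) = 2|s|$.

```python
import numpy as np

def newton_system(F, J, u_init, omega=1.0, eps_r=1e-10, max_iter=50):
    """Relaxed Newton iteration for F(u) = 0 with Jacobian J(u)."""
    u_ = np.asarray(u_init, dtype=float)
    for k in range(max_iter):
        F_ = F(u_)
        if np.linalg.norm(F_) <= eps_r:
            return u_, k
        du = np.linalg.solve(J(u_), -F_)    # step 1
        u_ = u_ + omega*du                  # steps 2 and 3
    return u_, max_iter

def pendulum_CN(u_1, dt, beta):
    """Return F and J for one Crank-Nicolson step, cf. (5.18)-(5.19).

    u_1 = (omega, theta) at the previous time level.
    """
    def F(u):
        s = u[0] + u_1[0]
        return np.array([
            u[0] - u_1[0] + dt*np.sin(0.5*(u[1] + u_1[1]))
            + 0.25*dt*beta*s*abs(s),
            u[1] - u_1[1] - 0.5*dt*s])

    def J(u):
        s = u[0] + u_1[0]
        return np.array([
            [1 + 0.5*dt*beta*abs(s), 0.5*dt*np.cos(0.5*(u[1] + u_1[1]))],
            [-0.5*dt, 1.0]])
    return F, J

u_1 = np.array([0.0, 0.5])                  # assumed state: at rest, angle 0.5
F, J = pendulum_CN(u_1, dt=0.1, beta=0.9)
u, k = newton_system(F, J, u_init=u_1)
```

Using the previous time level as the start vector is the natural choice in a time-stepping context, since it is close to the solution when $\Delta t$ is small.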


For the special system with structure $A(u)u = b(u)$,

$$ F_i = \sum_k A_{i,k}(u)u_k - b_i(u), $$

one gets

$$ J_{i,j} = \sum_k \frac{\partial A_{i,k}}{\partial u_j}u_k + A_{i,j} - \frac{\partial b_i}{\partial u_j} . $$

We realize that the Jacobian needed in Newton’s method consists of $A(u^-)$ as in the
Picard iteration plus two additional terms arising from the differentiation. Using the
notation $A'(u)$ for $\partial A/\partial u$ (a quantity with three indices: $\partial A_{i,k}/\partial u_j$), and $b'(u)$ for
$\partial b/\partial u$ (a quantity with two indices: $\partial b_i/\partial u_j$), we can write the linear system to be
solved as

$$ (A + A'u - b')\delta u = -Au + b, $$

or

$$ (A(u^-) + A'(u^-)u^- - b'(u^-))\delta u = -A(u^-)u^- + b(u^-) . $$

Rearranging the terms demonstrates the difference from the system solved in each
Picard iteration:

$$ \underbrace{A(u^-)(u^- + \delta u) - b(u^-)}_{\text{Picard system}} + \gamma(A'(u^-)u^- - b'(u^-))\delta u = 0 . $$

Here we have inserted a parameter $\gamma$ such that $\gamma = 0$ gives the Picard system and
$\gamma = 1$ gives the Newton system. Such a parameter can be handy in software to
easily switch between the methods.


Combined algorithm for Picard and Newton iteration
Given $A(u)$, $b(u)$, and an initial guess $u^-$, iterate until convergence:

1. solve $(A(u^-) + \gamma(A'(u^-)u^- - b'(u^-)))\delta u = -A(u^-)u^- + b(u^-)$ with respect
   to $\delta u$
2. $u = u^- + \omega\delta u$
3. $u^- \leftarrow u$

$\gamma = 1$ gives a Newton method while $\gamma = 0$ corresponds to Picard iteration.


5.2.3 Stopping Criteria 


Let $\|\cdot\|$ be the standard Euclidean vector norm. Four termination criteria are much
in use:

• Absolute change in solution: $\|u - u^-\| \leq \epsilon_u$
• Relative change in solution: $\|u - u^-\| \leq \epsilon_u\|u_0\|$, where $u_0$ denotes the start
  value of $u^-$ in the iteration
• Absolute residual: $\|F(u)\| \leq \epsilon_r$
• Relative residual: $\|F(u)\| \leq \epsilon_r\|F(u_0)\|$

To prevent divergent iterations from running forever, one terminates the iterations when the
current number of iterations $k$ exceeds a maximum value $k_{\max}$.

The relative criteria are most used since they are not sensitive to the characteristic
size of $u$. Nevertheless, the relative criteria can be misleading when the initial start
value for the iteration is very close to the solution, since an unnecessary reduction
in the error measure is enforced. In such cases the absolute criteria work better. It is
common to combine the absolute and relative measures of the size of the residual,
as in

$$ \|F(u)\| \leq \epsilon_{rr}\|F(u_0)\| + \epsilon_{ra}, \tag{5.21} $$

where $\epsilon_{rr}$ is the tolerance in the relative criterion and $\epsilon_{ra}$ is the tolerance in the
absolute criterion. With a very good initial guess for the iteration (typically the
solution of a differential equation at the previous time level), the term $\|F(u_0)\|$ is
small and $\epsilon_{ra}$ is the dominating tolerance. Otherwise, $\epsilon_{rr}\|F(u_0)\|$ and the relative
criterion dominate.

With the change in solution as criterion we can formulate a combined absolute
and relative measure of the change in the solution:

$$ \|\delta u\| \leq \epsilon_{ur}\|u_0\| + \epsilon_{ua} . \tag{5.22} $$

The ultimate termination criterion, combining the residual and the change in
solution with a test on the maximum number of iterations, can be expressed as

$$ \|F(u)\| \leq \epsilon_{rr}\|F(u_0)\| + \epsilon_{ra} \quad\text{or}\quad \|\delta u\| \leq \epsilon_{ur}\|u_0\| + \epsilon_{ua} \quad\text{or}\quad k > k_{\max} . \tag{5.23} $$
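The combined criterion (5.23) is easy to package as a small helper that an iteration loop can call. The function name, argument order, and default tolerances below are assumptions made for illustration.

```python
import numpy as np

def should_stop(F_u, F_u0, du, u0, k,
                eps_rr=1e-6, eps_ra=1e-10,
                eps_ur=1e-6, eps_ua=1e-10, k_max=100):
    """Evaluate the combined termination criterion (5.23).

    F_u:  residual vector F(u) at the current iterate
    F_u0: residual vector F(u_0) at the start value
    du:   latest change in the solution
    u0:   start value of the iteration
    k:    current number of iterations
    """
    residual_ok = np.linalg.norm(F_u) <= eps_rr*np.linalg.norm(F_u0) + eps_ra
    change_ok = np.linalg.norm(du) <= eps_ur*np.linalg.norm(u0) + eps_ua
    return bool(residual_ok or change_ok or k > k_max)
```

A loop of the form `while not should_stop(...)` then terminates on whichever of the three conditions is met first. In practice one may also want to report which condition triggered, since stopping on `k > k_max` signals lack of convergence.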


5.2.4 Example: A Nonlinear ODE Model from Epidemiology


A very simple model for the spreading of a disease, such as a flu, takes the form of
a $2\times 2$ ODE system

$$ S' = -\beta SI, \tag{5.24} $$
$$ I' = \beta SI - \nu I, \tag{5.25} $$

where $S(t)$ is the number of people who can get ill (susceptibles) and $I(t)$ is the
number of people who are ill (infected). The constants $\beta > 0$ and $\nu > 0$ must be
given along with initial conditions $S(0)$ and $I(0)$.


Implicit time discretization A Crank-Nicolson scheme leads to a $2\times 2$ system of
nonlinear algebraic equations in the unknowns $S^{n+1}$ and $I^{n+1}$:

$$ \frac{S^{n+1} - S^n}{\Delta t} = -\beta[SI]^{n+\frac{1}{2}} \approx -\frac{\beta}{2}(S^nI^n + S^{n+1}I^{n+1}), \tag{5.26} $$

$$ \frac{I^{n+1} - I^n}{\Delta t} = \beta[SI]^{n+\frac{1}{2}} - \nu I^{n+\frac{1}{2}}
\approx \frac{\beta}{2}(S^nI^n + S^{n+1}I^{n+1}) - \frac{\nu}{2}(I^n + I^{n+1}) . \tag{5.27} $$

Introducing $S$ for $S^{n+1}$, $S^{(1)}$ for $S^n$, $I$ for $I^{n+1}$ and $I^{(1)}$ for $I^n$, we can rewrite the
system as

$$ F_S(S, I) = S - S^{(1)} + \frac{1}{2}\Delta t\beta(S^{(1)}I^{(1)} + SI) = 0, \tag{5.28} $$

$$ F_I(S, I) = I - I^{(1)} - \frac{1}{2}\Delta t\beta(S^{(1)}I^{(1)} + SI) + \frac{1}{2}\Delta t\nu(I^{(1)} + I) = 0 . \tag{5.29} $$


A Picard iteration We assume that we have approximations $S^-$ and $I^-$ to $S$ and
$I$, respectively. A way of linearizing the only nonlinear term $SI$ is to write $I^-S$ in
the $F_S = 0$ equation and $S^-I$ in the $F_I = 0$ equation, which also decouples the
equations. Solving the resulting linear equations with respect to the unknowns $S$
and $I$ gives

$$ S = \frac{S^{(1)} - \frac{1}{2}\Delta t\beta S^{(1)}I^{(1)}}{1 + \frac{1}{2}\Delta t\beta I^-}, $$

$$ I = \frac{I^{(1)} + \frac{1}{2}\Delta t\beta S^{(1)}I^{(1)} - \frac{1}{2}\Delta t\nu I^{(1)}}{1 - \frac{1}{2}\Delta t\beta S^- + \frac{1}{2}\Delta t\nu} . $$

Before a new iteration, we must update $S^- \leftarrow S$ and $I^- \leftarrow I$.
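The decoupled Picard formulas can be coded directly. The sketch below performs one Crank-Nicolson time step and stops when the change between successive iterates is small. The function name, the stopping test, and the parameter values in the example are assumptions chosen for illustration.

```python
def SIR_CN_picard_step(S_1, I_1, dt, beta, nu, eps=1e-10, max_iter=100):
    """One Crank-Nicolson step of S' = -beta*S*I, I' = beta*S*I - nu*I
    by Picard iteration. S_1, I_1 are the values at the previous time level.
    Returns the new S, I and the number of iterations.
    """
    S_, I_ = S_1, I_1                       # start values
    for k in range(1, max_iter + 1):
        S = (S_1 - 0.5*dt*beta*S_1*I_1)/(1 + 0.5*dt*beta*I_)
        I = (I_1 + 0.5*dt*beta*S_1*I_1 - 0.5*dt*nu*I_1) \
            /(1 - 0.5*dt*beta*S_ + 0.5*dt*nu)
        if abs(S - S_) + abs(I - I_) <= eps:
            return S, I, k
        S_, I_ = S, I                       # update before next iteration
    return S, I, max_iter

# Example with assumed (scaled) data: 95% susceptible, 5% infected
S, I, k = SIR_CN_picard_step(S_1=0.95, I_1=0.05, dt=0.1, beta=2.0, nu=0.5)
```

Note that both updates use the old values $S^-$ and $I^-$, in line with the decoupling described above. Using the freshly computed $S$ in the formula for $I$ would be a possible variation, but it is not what the formulas in the text prescribe.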


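Newton’s method The nonlinear system (5.28)–(5.29) can be written as $F(u) = 0$
with $F = (F_S, F_I)$ and $u = (S, I)$. The Jacobian becomes

$$ J = \begin{pmatrix}
\frac{\partial}{\partial S}F_S & \frac{\partial}{\partial I}F_S\\
\frac{\partial}{\partial S}F_I & \frac{\partial}{\partial I}F_I
\end{pmatrix}
= \begin{pmatrix}
1 + \frac{1}{2}\Delta t\beta I & \frac{1}{2}\Delta t\beta S\\
-\frac{1}{2}\Delta t\beta I & 1 - \frac{1}{2}\Delta t\beta S + \frac{1}{2}\Delta t\nu
\end{pmatrix} . $$

The Newton system $J(u^-)\delta u = -F(u^-)$ to be solved in each iteration is then

$$ \begin{pmatrix}
1 + \frac{1}{2}\Delta t\beta I^- & \frac{1}{2}\Delta t\beta S^-\\
-\frac{1}{2}\Delta t\beta I^- & 1 - \frac{1}{2}\Delta t\beta S^- + \frac{1}{2}\Delta t\nu
\end{pmatrix}
\begin{pmatrix}\delta S\\ \delta I\end{pmatrix}
= -\begin{pmatrix}
S^- - S^{(1)} + \frac{1}{2}\Delta t\beta(S^{(1)}I^{(1)} + S^-I^-)\\
I^- - I^{(1)} - \frac{1}{2}\Delta t\beta(S^{(1)}I^{(1)} + S^-I^-) + \frac{1}{2}\Delta t\nu(I^{(1)} + I^-)
\end{pmatrix} . $$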

Remark For this particular system of ODEs, explicit time integration methods 
work very well. Even a Forward Euler scheme is fine, but (as also experienced more 
generally) the 4-th order Runge-Kutta method is an excellent balance between high 
accuracy, high efficiency, and simplicity. 


5.3 Linearization at the Differential Equation Level 
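

The attention is now turned to nonlinear partial differential equations (PDEs) and
application of the techniques explained above for ODEs. The model problem is a
nonlinear diffusion equation for $u(x, t)$:

$$ \frac{\partial u}{\partial t} = \nabla\cdot(\alpha(u)\nabla u) + f(u), \quad x\in\Omega,\ t\in(0, T], \tag{5.30} $$

$$ -\alpha(u)\frac{\partial u}{\partial n} = g, \quad x\in\partial\Omega_N,\ t\in(0, T], \tag{5.31} $$

$$ u = u_0, \quad x\in\partial\Omega_D,\ t\in(0, T] . \tag{5.32} $$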


In the present section, our aim is to discretize this problem in time and then 
present techniques for linearizing the time-discrete PDE problem “at the PDE level” 
such that we transform the nonlinear stationary PDE problem at each time level 
into a sequence of linear PDE problems, which can be solved using any method 
for linear PDEs. This strategy avoids the solution of systems of nonlinear algebraic 
equations. In Sect. 5.4 we shall take the opposite (and more common) approach: 
discretize the nonlinear problem in time and space first, and then solve the resulting 
nonlinear algebraic equations at each time level by the methods of Sect. 5.2. Very 
often, the two approaches are mathematically identical, so there is no preference 
from a computational efficiency point of view. The details of the ideas sketched 
above will hopefully become clear through the forthcoming examples. 


5.3.1 Explicit Time Integration 


The nonlinearities in the PDE are trivial to deal with if we choose an explicit time
integration method for (5.30), such as the Forward Euler method:

$$ [D_t^+ u = \nabla\cdot(\alpha(u)\nabla u) + f(u)]^n, $$

or written out,

$$ \frac{u^{n+1} - u^n}{\Delta t} = \nabla\cdot(\alpha(u^n)\nabla u^n) + f(u^n), $$

which is a linear equation in the unknown $u^{n+1}$ with solution

$$ u^{n+1} = u^n + \Delta t\nabla\cdot(\alpha(u^n)\nabla u^n) + \Delta t f(u^n) . $$

The disadvantage with this discretization is the strict stability criterion
$\Delta t \leq h^2/(6\max\alpha)$ for the case $f = 0$ and a standard 2nd-order finite difference
discretization in 3D space with mesh cell sizes $h = \Delta x = \Delta y = \Delta z$.
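To make the explicit scheme concrete, here is a minimal 1D sketch. The spatial discretization is an assumption made here, since this section works at the PDE level: a standard centered difference with an arithmetic mean of $\alpha$ at the cell midpoints, and Dirichlet values kept fixed at both ends. In 1D the analogous stability limit for $f = 0$ is $\Delta t \leq h^2/(2\max\alpha)$.

```python
import numpy as np

def FE_nonlinear_diffusion_1D(u0, alpha, f, h, dt, Nt):
    """Forward Euler for u_t = (alpha(u) u_x)_x + f(u) on a uniform 1D mesh.

    u0: initial values at the mesh points (end values act as Dirichlet data)
    alpha, f: vectorized functions of u
    """
    u = np.array(u0, dtype=float)
    for n in range(Nt):
        a = alpha(u)
        a_mid = 0.5*(a[1:] + a[:-1])            # alpha at points i+1/2
        flux = a_mid*(u[1:] - u[:-1])/h         # alpha*u_x at points i+1/2
        u_new = u.copy()                        # boundary values unchanged
        u_new[1:-1] = u[1:-1] + dt*((flux[1:] - flux[:-1])/h + f(u[1:-1]))
        u = u_new
    return u

x = np.linspace(0, 1, 51)
h = x[1] - x[0]
alpha = lambda u: 1 + u**2                      # assumed diffusion coefficient
f = lambda u: np.zeros_like(u)                  # no source term
dt = 0.9*h**2/(2*2.0)                           # max alpha = 2 for |u| <= 1
u = FE_nonlinear_diffusion_1D(np.sin(np.pi*x), alpha, f, h, dt, Nt=200)
```

All nonlinear coefficients are evaluated at the known time level, so no iteration is needed. The price is the small time step dictated by the stability limit.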


5.3.2 Backward Euler Scheme and Picard Iteration 


A Backward Euler scheme for (5.30) reads

$$ [D_t^- u = \nabla\cdot(\alpha(u)\nabla u) + f(u)]^n . $$

Written out,

$$ \frac{u^n - u^{n-1}}{\Delta t} = \nabla\cdot(\alpha(u^n)\nabla u^n) + f(u^n) . \tag{5.33} $$

This is a nonlinear PDE for the unknown function $u^n(x)$. Such a PDE can be
viewed as a time-independent PDE where $u^{n-1}(x)$ is a known function.

We introduce a Picard iteration with $k$ as iteration counter. A typical linearization
of the $\nabla\cdot(\alpha(u^n)\nabla u^n)$ term in iteration $k + 1$ is to use the previously computed $u^{n,k}$
approximation in the diffusion coefficient: $\alpha(u^{n,k})$. The nonlinear source term is
treated similarly: $f(u^{n,k})$. The unknown function $u^{n,k+1}$ then fulfills the linear
PDE

$$ \frac{u^{n,k+1} - u^{n-1}}{\Delta t} = \nabla\cdot(\alpha(u^{n,k})\nabla u^{n,k+1}) + f(u^{n,k}) . \tag{5.34} $$

The initial guess for the Picard iteration at this time level can be taken as the solution
at the previous time level: $u^{n,0} = u^{n-1}$.

We can alternatively apply the implementation-friendly notation where $u$
corresponds to the unknown we want to solve for, i.e., $u^{n,k+1}$ above, and $u^-$ is the most
recently computed value, $u^{n,k}$ above. Moreover, $u^{(1)}$ denotes the unknown function
at the previous time level, $u^{n-1}$ above. The PDE to be solved in a Picard iteration
then looks like

$$ \frac{u - u^{(1)}}{\Delta t} = \nabla\cdot(\alpha(u^-)\nabla u) + f(u^-) . \tag{5.35} $$

At the beginning of the iteration we start with the value from the previous time
level: $u^- = u^{(1)}$, and after each iteration, $u^-$ is updated to $u$.
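A 1D sketch of one Backward Euler step with Picard iteration, following (5.35), is given below. The spatial discretization (centered differences, arithmetic mean of $\alpha$ at midpoints, Dirichlet values taken from the previous time level) and the use of a dense linear solver are assumptions made to keep the example short. A real code would use a tridiagonal or sparse solver.

```python
import numpy as np

def BE_picard_diffusion_step(u_1, alpha, f, h, dt, eps=1e-10, max_iter=100):
    """One Backward Euler step of u_t = (alpha(u) u_x)_x + f(u) in 1D,
    linearized by Picard iteration as in (5.35).

    u_1: solution at the previous time level (end values are Dirichlet data)
    """
    N = len(u_1) - 1
    C = dt/h**2
    u_ = u_1.copy()                          # start value u^- = u^(1)
    for k in range(1, max_iter + 1):
        a = alpha(u_)
        a_mid = 0.5*(a[1:] + a[:-1])         # alpha(u^-) at points i+1/2
        A = np.eye(N + 1)                    # boundary rows: u_0, u_N fixed
        rhs = u_1 + dt*f(u_)
        rhs[0], rhs[N] = u_1[0], u_1[N]
        for i in range(1, N):
            A[i, i-1] = -C*a_mid[i-1]
            A[i, i+1] = -C*a_mid[i]
            A[i, i] = 1 + C*(a_mid[i-1] + a_mid[i])
        u = np.linalg.solve(A, rhs)          # linear problem for u
        if np.abs(u - u_).max() <= eps:
            return u, k
        u_ = u                               # update u^- before next pass
    return u, max_iter

x = np.linspace(0, 1, 21)
h = x[1] - x[0]
alpha = lambda u: 1 + u**2                   # assumed diffusion coefficient
f = lambda u: np.zeros_like(u)               # no source term
u, k = BE_picard_diffusion_step(np.sin(np.pi*x), alpha, f, h, dt=0.01)
```

Each pass assembles and solves a linear system whose coefficients depend on $u^-$ only, which is exactly the sequence of linear problems the text describes.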


Remark on notation 

The previous derivations of the numerical scheme for time discretizations of
PDEs have, strictly speaking, a somewhat sloppy notation, but it is much used
and convenient to read. A more precise notation must distinguish clearly
between the exact solution of the PDE problem, here denoted $u_e(x, t)$, and the
exact solution of the spatial problem, arising after time discretization at each time
level, where (5.33) is an example. The latter is here represented as $u^n(x)$ and is
an approximation to $u_e(x, t_n)$. Then we have another approximation $u^{n,k}(x)$ to
$u^n(x)$ when solving the nonlinear PDE problem for $u^n$ by iteration methods, as
in (5.34).

In our notation, $u$ is a synonym for $u^{n,k+1}$ and $u^{(1)}$ is a synonym for $u^{n-1}$,
inspired by what are natural variable names in a code. We will usually state
the PDE problem in terms of $u$ and quickly redefine the symbol $u$ to mean the
numerical approximation, while $u_e$ is not explicitly introduced unless we need
to talk about the exact solution and the approximate solution at the same time.


5.3.3 Backward Euler Scheme and Newton’s Method 


At time level n, we have to solve the stationary PDE (5.33). In the previous section, 
we saw how this can be done with Picard iterations. Another alternative is to apply 
the idea of Newton’s method in a clever way. Normally, Newton’s method is defined 
for systems of algebraic equations, but the idea of the method can be applied at the 
PDE level too. 


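Linearization via Taylor expansions Let $u^{n,k}$ be an approximation to the
unknown $u^n$. We seek a better approximation of the form

$$ u^n = u^{n,k} + \delta u . \tag{5.36} $$

The idea is to insert (5.36) in (5.33), Taylor expand the nonlinearities and keep
only the terms that are linear in $\delta u$ (which makes (5.36) an approximation for $u^n$).
Then we can solve a linear PDE for the correction $\delta u$ and use (5.36) to find a new
approximation

$$ u^{n,k+1} = u^{n,k} + \delta u $$

to $u^n$. Repeating this procedure gives a sequence $u^{n,k+1}$, $k = 0, 1, \ldots$ that hopefully
converges to the goal $u^n$.

Let us carry out all the mathematical details for the nonlinear diffusion PDE
discretized by the Backward Euler method. Inserting (5.36) in (5.33) gives

$$ \frac{u^{n,k} + \delta u - u^{n-1}}{\Delta t} = \nabla\cdot(\alpha(u^{n,k} + \delta u)\nabla(u^{n,k} + \delta u)) + f(u^{n,k} + \delta u) . \tag{5.37} $$

We can Taylor expand $\alpha(u^{n,k} + \delta u)$ and $f(u^{n,k} + \delta u)$:

$$ \alpha(u^{n,k} + \delta u) = \alpha(u^{n,k}) + \frac{d\alpha}{du}(u^{n,k})\delta u + O(\delta u^2) \approx \alpha(u^{n,k}) + \alpha'(u^{n,k})\delta u, $$

$$ f(u^{n,k} + \delta u) = f(u^{n,k}) + \frac{df}{du}(u^{n,k})\delta u + O(\delta u^2) \approx f(u^{n,k}) + f'(u^{n,k})\delta u . $$

Inserting the linear approximations of $\alpha$ and $f$ in (5.37) results in

$$ \begin{aligned}
\frac{u^{n,k} + \delta u - u^{n-1}}{\Delta t} ={}& \nabla\cdot(\alpha(u^{n,k})\nabla u^{n,k}) + f(u^{n,k})\\
&+ \nabla\cdot(\alpha(u^{n,k})\nabla\delta u) + \nabla\cdot(\alpha'(u^{n,k})\delta u\nabla u^{n,k})\\
&+ \nabla\cdot(\alpha'(u^{n,k})\delta u\nabla\delta u) + f'(u^{n,k})\delta u .
\end{aligned} \tag{5.38} $$

The term $\alpha'(u^{n,k})\delta u\nabla\delta u$ is of order $\delta u^2$ and therefore omitted since we expect the
correction $\delta u$ to be small ($\delta u \gg \delta u^2$). Reorganizing the equation gives a PDE for
$\delta u$ that we can write in short form as

$$ \delta F(\delta u; u^{n,k}) = -F(u^{n,k}), $$

where $F$ collects the right-hand side minus the left-hand side of (5.33), evaluated
at $u^{n,k}$, and $\delta F$ is its linearization:

$$ F(u^{n,k}) = -\frac{u^{n,k} - u^{n-1}}{\Delta t} + \nabla\cdot(\alpha(u^{n,k})\nabla u^{n,k}) + f(u^{n,k}), \tag{5.39} $$

$$ \begin{aligned}
\delta F(\delta u; u^{n,k}) ={}& -\frac{1}{\Delta t}\delta u + \nabla\cdot(\alpha(u^{n,k})\nabla\delta u)\\
&+ \nabla\cdot(\alpha'(u^{n,k})\delta u\nabla u^{n,k}) + f'(u^{n,k})\delta u .
\end{aligned} \tag{5.40} $$

Note that $\delta F$ is a linear function of $\delta u$, and $F$ contains only terms that are known,
such that the PDE for $\delta u$ is indeed linear.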


Observations 

The notational form $\delta F = -F$ resembles the Newton system $J\delta u = -F$ for
systems of algebraic equations, with $\delta F$ as $J\delta u$. The unknown vector in a linear
system of algebraic equations enters the system as a linear operator in terms of a
matrix-vector product ($J\delta u$), while at the PDE level we have a linear differential
operator instead ($\delta F$).


Similarity with Picard iteration We can rewrite the PDE for $\delta u$ in a slightly different way too if we define $u^{n,k} + \delta u$ as $u^{n,k+1}$:

$$\begin{aligned}
\frac{u^{n,k+1} - u^{n-1}}{\Delta t} ={}& \nabla\cdot\left(\alpha(u^{n,k})\nabla u^{n,k+1}\right) + f(u^{n,k}) \\
&+ \nabla\cdot\left(\alpha'(u^{n,k})\,\delta u\,\nabla u^{n,k}\right) + f'(u^{n,k})\,\delta u.
\end{aligned}\tag{5.41}$$

Note that the first line is the same PDE as arises in the Picard iteration, while the remaining terms arise from the differentiations that are an inherent ingredient in Newton's method.


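Implementation For coding we want to introduce $u$ for $u^n$, $u^-$ for $u^{n,k}$, and $u^{(1)}$ for $u^{n-1}$. The formulas for $F$ and $\delta F$ are then more clearly written as

$$F(u^-) = \frac{u^- - u^{(1)}}{\Delta t} - \nabla\cdot\left(\alpha(u^-)\nabla u^-\right) - f(u^-), \tag{5.42}$$

$$\delta F(\delta u; u^-) = \frac{1}{\Delta t}\delta u - \nabla\cdot\left(\alpha(u^-)\nabla\delta u\right) - \nabla\cdot\left(\alpha'(u^-)\,\delta u\,\nabla u^-\right) - f'(u^-)\,\delta u. \tag{5.43}$$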


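The form that orders the PDE as the Picard iteration terms plus the Newton method's derivative terms becomes

$$\begin{aligned}
\frac{u - u^{(1)}}{\Delta t} ={}& \nabla\cdot\left(\alpha(u^-)\nabla u\right) + f(u^-) \\
&+ \gamma\left(\nabla\cdot\left(\alpha'(u^-)(u - u^-)\nabla u^-\right) + f'(u^-)(u - u^-)\right).
\end{aligned}\tag{5.44}$$

The Picard and full Newton versions correspond to $\gamma = 0$ and $\gamma = 1$, respectively.

Derivation with alternative notation Some may prefer to derive the linearized PDE for $\delta u$ using the more compact notation. We start with inserting $u^n = u^- + \delta u$ to get

$$\frac{u^- + \delta u - u^{(1)}}{\Delta t} = \nabla\cdot\left(\alpha(u^- + \delta u)\nabla(u^- + \delta u)\right) + f(u^- + \delta u).$$

Taylor expanding,

$$\alpha(u^- + \delta u) \approx \alpha(u^-) + \alpha'(u^-)\,\delta u,$$

$$f(u^- + \delta u) \approx f(u^-) + f'(u^-)\,\delta u,$$

and inserting these expressions gives a less cluttered PDE for $\delta u$:

$$\begin{aligned}
\frac{u^- + \delta u - u^{(1)}}{\Delta t} ={}& \nabla\cdot\left(\alpha(u^-)\nabla u^-\right) + f(u^-) \\
&+ \nabla\cdot\left(\alpha(u^-)\nabla\delta u\right) + \nabla\cdot\left(\alpha'(u^-)\,\delta u\,\nabla u^-\right) \\
&+ \nabla\cdot\left(\alpha'(u^-)\,\delta u\,\nabla\delta u\right) + f'(u^-)\,\delta u.
\end{aligned}$$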


5.3.4 Crank-Nicolson Discretization 


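A Crank-Nicolson discretization of (5.30) applies a centered difference at $t_{n+\frac{1}{2}}$:

$$[D_t u = \nabla\cdot(\alpha(u)\nabla u) + f(u)]^{n+\frac{1}{2}}.$$

The standard technique is to apply an arithmetic average for quantities defined between two mesh points, e.g.,

$$u^{n+\frac{1}{2}} \approx \frac{1}{2}(u^n + u^{n+1}).$$

However, with nonlinear terms we have many choices of formulating an arithmetic mean:

$$[f(u)]^{n+\frac{1}{2}} \approx f\left(\tfrac{1}{2}(u^n + u^{n+1})\right) = [f(\overline{u}^t)]^{n+\frac{1}{2}}, \tag{5.45}$$

$$[f(u)]^{n+\frac{1}{2}} \approx \tfrac{1}{2}\left(f(u^n) + f(u^{n+1})\right) = [\overline{f(u)}^t]^{n+\frac{1}{2}}, \tag{5.46}$$

$$[\alpha(u)\nabla u]^{n+\frac{1}{2}} \approx \alpha\left(\tfrac{1}{2}(u^n + u^{n+1})\right)\nabla\left(\tfrac{1}{2}(u^n + u^{n+1})\right) = [\alpha(\overline{u}^t)\nabla\overline{u}^t]^{n+\frac{1}{2}}, \tag{5.47}$$

$$[\alpha(u)\nabla u]^{n+\frac{1}{2}} \approx \tfrac{1}{2}\left(\alpha(u^n) + \alpha(u^{n+1})\right)\nabla\left(\tfrac{1}{2}(u^n + u^{n+1})\right) = [\overline{\alpha(u)}^t\nabla\overline{u}^t]^{n+\frac{1}{2}}, \tag{5.48}$$

$$[\alpha(u)\nabla u]^{n+\frac{1}{2}} \approx \tfrac{1}{2}\left(\alpha(u^n)\nabla u^n + \alpha(u^{n+1})\nabla u^{n+1}\right) = [\overline{\alpha(u)\nabla u}^t]^{n+\frac{1}{2}}. \tag{5.49}$$

A big question is whether there are significant differences in accuracy between taking the products of arithmetic means or taking the arithmetic mean of products. Exercise 5.6 investigates this question, and the answer is that the approximation is $\mathcal{O}(\Delta t^2)$ in both cases.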
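As a small numerical illustration of how close the two kinds of means are, the sketch below compares (5.45) and (5.46) for an assumed test case, $f(u) = u^2$ with $u(t) = e^t$ (both chosen here purely for illustration and not taken from the exercise). It estimates how fast the difference between the two means shrinks when $\Delta t$ is halved.

```python
import numpy as np

def mean_difference(dt, t=0.0):
    """Return |f(mean of u) - mean of f(u)| over the interval [t, t+dt]."""
    f = lambda u: u**2                        # assumed nonlinearity (illustration)
    u_n, u_np1 = np.exp(t), np.exp(t + dt)    # assumed solution u(t) = exp(t)
    f_of_mean = f(0.5*(u_n + u_np1))          # arithmetic mean inside f, cf. (5.45)
    mean_of_f = 0.5*(f(u_n) + f(u_np1))       # arithmetic mean of f values, cf. (5.46)
    return abs(f_of_mean - mean_of_f)

d_coarse = mean_difference(0.1)
d_fine = mean_difference(0.05)
# Observed order p in: difference ~ C*dt**p
rate = np.log(d_coarse/d_fine)/np.log(2.0)
print("observed order of the difference:", rate)
```

For this test case the difference between the two means decreases at a rate close to 2, so the choice of mean does not change the second-order accuracy in time.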
5.4 1D Stationary Nonlinear Differential Equations


Section 5.3 presented methods for linearizing time-discrete PDEs directly prior to 
discretization in space. We can alternatively carry out the discretization in space of 
the time-discrete nonlinear PDE problem and get a system of nonlinear algebraic 
equations, which can be solved by Picard iteration or Newton’s method as presented 
in Sect. 5.2. This latter approach will now be described in detail. 

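We shall work with the 1D problem

$$-(\alpha(u)u')' + au = f(u),\quad x\in(0,L),\quad \alpha(u(0))u'(0) = C,\quad u(L) = D. \tag{5.50}$$

The problem (5.50) arises from the stationary limit of a diffusion equation,

$$\frac{\partial u}{\partial t} = \frac{\partial}{\partial x}\left(\alpha(u)\frac{\partial u}{\partial x}\right) - au + f(u), \tag{5.51}$$

as $t\rightarrow\infty$ and $\partial u/\partial t\rightarrow 0$. Alternatively, the problem (5.50) arises at each time level from implicit time discretization of (5.51). For example, a Backward Euler scheme for (5.51) leads to

$$\frac{u^n - u^{n-1}}{\Delta t} = \frac{d}{dx}\left(\alpha(u^n)\frac{du^n}{dx}\right) - au^n + f(u^n). \tag{5.52}$$

Introducing $u(x)$ for $u^n(x)$, $u^{(1)}$ for $u^{n-1}$, and defining $f(u)$ in (5.50) to be $f(u)$ in (5.52) plus $u^{n-1}/\Delta t$, gives (5.50) with $a$ replaced by $a + 1/\Delta t$ (that is, $a = 1/\Delta t$ when (5.51) has no $-au$ term).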

5.4.1 Finite Difference Discretization 

The nonlinearity in the differential equation (5.50) poses no more difficulty than a variable coefficient, as in the term $(\alpha(x)u')'$. We can therefore use a standard finite difference approach when discretizing the Laplace term with a variable coefficient:

$$[-D_x\alpha D_x u + au = f]_i.$$

Writing this out for a uniform mesh with points $x_i = i\Delta x$, $i = 0,\ldots,N_x$, leads to

$$-\frac{1}{\Delta x^2}\left(\alpha_{i+\frac{1}{2}}(u_{i+1} - u_i) - \alpha_{i-\frac{1}{2}}(u_i - u_{i-1})\right) + au_i = f(u_i). \tag{5.53}$$

This equation is valid at all the mesh points $i = 0, 1, \ldots, N_x - 1$. At $i = N_x$ we have the Dirichlet condition $u_i = D$. The only difference from the case with $(\alpha(x)u')'$ and $f(x)$ is that now $\alpha$ and $f$ are functions of $u$ and not only of $x$: $(\alpha(u(x))u')'$ and $f(u(x))$.

The quantity $\alpha_{i+\frac{1}{2}}$, evaluated between two mesh points, needs a comment. Since $\alpha$ depends on $u$ and $u$ is only known at the mesh points, we need to express $\alpha_{i+\frac{1}{2}}$ in terms of $u_i$ and $u_{i+1}$. For this purpose we use an arithmetic mean, although a harmonic mean is also common in this context if $\alpha$ features large jumps. There are two choices of arithmetic means:

$$\alpha_{i+\frac{1}{2}} \approx \alpha\left(\tfrac{1}{2}(u_i + u_{i+1})\right) = [\alpha(\overline{u}^x)]^{i+\frac{1}{2}}, \tag{5.54}$$

$$\alpha_{i+\frac{1}{2}} \approx \tfrac{1}{2}\left(\alpha(u_i) + \alpha(u_{i+1})\right) = [\overline{\alpha(u)}^x]^{i+\frac{1}{2}}. \tag{5.55}$$
Equation (5.53) with the latter approximation then looks like

$$-\frac{1}{2\Delta x^2}\left((\alpha(u_i) + \alpha(u_{i+1}))(u_{i+1} - u_i) - (\alpha(u_{i-1}) + \alpha(u_i))(u_i - u_{i-1})\right) + au_i = f(u_i), \tag{5.56}$$

or written more compactly,

$$[-D_x\overline{\alpha}^x D_x u + au = f]_i.$$

At mesh point $i = 0$ we have the boundary condition $\alpha(u)u' = C$, which is discretized by

$$[\alpha(u)D_{2x}u = C]_0,$$

meaning

$$\alpha(u_0)\frac{u_1 - u_{-1}}{2\Delta x} = C. \tag{5.57}$$

The fictitious value $u_{-1}$ can be eliminated with the aid of (5.56) for $i = 0$. Formally, (5.56) should be solved with respect to $u_{i-1}$ and that value (for $i = 0$) should be inserted in (5.57), but it is algebraically much easier to do it the other way around. Alternatively, one can use a ghost cell $[-\Delta x, 0]$ and update the $u_{-1}$ value in the ghost cell according to (5.57) after every Picard or Newton iteration. Such an approach means that we use a known $u_{-1}$ value in (5.56) from the previous iteration.


5.4.2 Solution of Algebraic Equations 


The structure of the equation system The nonlinear algebraic equations (5.56) are of the form $A(u)u = b(u)$ with

$$A_{i,i} = \frac{1}{2\Delta x^2}\left(\alpha(u_{i-1}) + 2\alpha(u_i) + \alpha(u_{i+1})\right) + a,$$

$$A_{i,i-1} = -\frac{1}{2\Delta x^2}\left(\alpha(u_{i-1}) + \alpha(u_i)\right),$$

$$A_{i,i+1} = -\frac{1}{2\Delta x^2}\left(\alpha(u_i) + \alpha(u_{i+1})\right),$$

$$b_i = f(u_i).$$

The matrix $A(u)$ is tridiagonal: $A_{i,j} = 0$ for $j > i + 1$ and $j < i - 1$.

The above expressions are valid for internal mesh points $1 \leq i \leq N_x - 1$. For $i = 0$ we need to express $u_{i-1} = u_{-1}$ in terms of $u_1$ using (5.57):

$$u_{-1} = u_1 - \frac{2\Delta x}{\alpha(u_0)}C. \tag{5.58}$$

This value must be inserted in $A_{0,0}$. The expression for $A_{i,i+1}$ applies for $i = 0$, and $A_{i,i-1}$ does not enter the system when $i = 0$.

Regarding the last equation, its form depends on whether we include the Dirichlet condition $u(L) = D$, meaning $u_{N_x} = D$, in the nonlinear algebraic equation system or not. Suppose we choose $(u_0, u_1, \ldots, u_{N_x-1})$ as unknowns, later referred to as systems without Dirichlet conditions. The last equation corresponds to $i = N_x - 1$. It involves the boundary value $u_{N_x}$, which is substituted by $D$. If the unknown vector includes the boundary value, $(u_0, u_1, \ldots, u_{N_x})$, later referred to as system including Dirichlet conditions, the equation for $i = N_x - 1$ just involves the unknown $u_{N_x}$, and the final equation becomes $u_{N_x} = D$, corresponding to $A_{i,i} = 1$ and $b_i = D$ for $i = N_x$.


Picard iteration The obvious Picard iteration scheme is to use previously computed values of $u_i$ in $A(u)$ and $b(u)$, as described in more detail in Sect. 5.2. With the notation $u^-$ for the most recently computed value of $u$, we have the system $F(u) \approx \hat{F}(u) = A(u^-)u - b(u^-)$, with $F = (F_0, F_1, \ldots, F_m)$, $u = (u_0, u_1, \ldots, u_m)$. The index $m$ is $N_x$ if the system includes the Dirichlet condition as a separate equation and $N_x - 1$ otherwise. The matrix $A(u^-)$ is tridiagonal, so the solution procedure is to fill a tridiagonal matrix data structure and the right-hand side vector with the right numbers and call a Gaussian elimination routine for tridiagonal linear systems.
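The Picard procedure can be sketched in a few lines of Python. The function below is a minimal illustration, not the book's implementation: the name `picard_1d`, the stopping criterion, and the test data are assumptions made here, and a dense matrix with `np.linalg.solve` replaces a dedicated tridiagonal solver to keep the sketch short. The system includes the Dirichlet condition as a separate equation, and the fictitious value $u_{-1}$ is taken from (5.58) with lagged values.

```python
import numpy as np

def picard_1d(alpha, f, a, C, D, L, Nx, tol=1e-10, max_iter=200):
    """Picard iteration for -(alpha(u)u')' + a*u = f(u) on (0, L) with
    alpha(u(0))u'(0) = C and u(L) = D (hypothetical helper)."""
    dx = L/Nx
    c = 1.0/(2*dx**2)
    u_ = np.full(Nx + 1, float(D))           # initial guess for u^-
    for it in range(1, max_iter + 1):
        al = alpha(u_)                       # alpha at most recent values
        A = np.zeros((Nx + 1, Nx + 1))
        b = np.zeros(Nx + 1)
        # i = 0: eliminate the fictitious value u_{-1} by (5.58)
        al_m1 = alpha(u_[1] - 2*dx*C/al[0])  # alpha at lagged u_{-1}
        A[0, 0] = c*(al_m1 + 2*al[0] + al[1]) + a
        A[0, 1] = -c*(al_m1 + 2*al[0] + al[1])
        b[0] = f(u_[0]) - (al_m1 + al[0])*C/(al[0]*dx)
        # internal points
        for i in range(1, Nx):
            A[i, i-1] = -c*(al[i-1] + al[i])
            A[i, i] = c*(al[i-1] + 2*al[i] + al[i+1]) + a
            A[i, i+1] = -c*(al[i] + al[i+1])
            b[i] = f(u_[i])
        # Dirichlet condition as a separate equation
        A[Nx, Nx] = 1.0
        b[Nx] = D
        u = np.linalg.solve(A, b)            # dense solve, for brevity only
        if np.abs(u - u_).max() < tol:
            return u, it
        u_ = u
    return u, max_iter

# Assumed test problem: alpha(u) = 1 + u**2, a = 0, f = 0, so that
# alpha(u)u' = C everywhere and u + u**3/3 = C*(x - L) + D + D**3/3.
L, C, D, Nx = 1.0, 0.5, 1.0, 40
u, iterations = picard_1d(lambda v: 1 + v**2, lambda v: 0.0*v,
                          a=0.0, C=C, D=D, L=L, Nx=Nx)
x = np.linspace(0, L, Nx + 1)
defect = np.abs(u + u**3/3 - (C*(x - L) + D + D**3/3)).max()
print(iterations, defect)
```

The `defect` measures how well the computed values satisfy the implicit exact solution of the assumed test problem; it should be at the level of the discretization error.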


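Mesh with two cells To understand the details, it helps to write out all the mathematics in a specific case with a small mesh, say just two cells ($N_x = 2$). We use $u_i^-$ for the $i$-th component in $u^-$.

The starting point is the basic expressions for the nonlinear equations at mesh points $i = 0$ and $i = 1$:

$$A_{0,-1}u_{-1} + A_{0,0}u_0 + A_{0,1}u_1 = b_0, \tag{5.59}$$

$$A_{1,0}u_0 + A_{1,1}u_1 + A_{1,2}u_2 = b_1. \tag{5.60}$$

Equation (5.59) written out reads

$$\begin{aligned}
\frac{1}{2\Delta x^2}\big(&-(\alpha(u_{-1}) + \alpha(u_0))u_{-1} \\
&+ (\alpha(u_{-1}) + 2\alpha(u_0) + \alpha(u_1))u_0 \\
&- (\alpha(u_0) + \alpha(u_1))u_1\big) + au_0 = f(u_0).
\end{aligned}$$

We must then replace $u_{-1}$ by (5.58). With Picard iteration we get

$$\begin{aligned}
\frac{1}{2\Delta x^2}\big(&-(\alpha(u_{-1}^-) + 2\alpha(u_0^-) + \alpha(u_1^-))u_1 \\
&+ (\alpha(u_{-1}^-) + 2\alpha(u_0^-) + \alpha(u_1^-))u_0\big) + au_0 \\
&= f(u_0^-) - \frac{1}{\alpha(u_0^-)\Delta x}(\alpha(u_{-1}^-) + \alpha(u_0^-))C,
\end{aligned}$$

where

$$u_{-1}^- = u_1^- - \frac{2\Delta x}{\alpha(u_0^-)}C.$$

Equation (5.60) contains the unknown $u_2$ for which we have a Dirichlet condition. In case we omit the condition as a separate equation, (5.60) with Picard iteration becomes

$$\begin{aligned}
\frac{1}{2\Delta x^2}\big(&-(\alpha(u_0^-) + \alpha(u_1^-))u_0 \\
&+ (\alpha(u_0^-) + 2\alpha(u_1^-) + \alpha(u_2^-))u_1 \\
&- (\alpha(u_1^-) + \alpha(u_2^-))u_2\big) + au_1 = f(u_1^-).
\end{aligned}$$

We must now move the $u_2$ term to the right-hand side and replace all occurrences of $u_2$ by $D$:

$$\begin{aligned}
\frac{1}{2\Delta x^2}\big(&-(\alpha(u_0^-) + \alpha(u_1^-))u_0 \\
&+ (\alpha(u_0^-) + 2\alpha(u_1^-) + \alpha(D))u_1\big) + au_1 \\
&= f(u_1^-) + \frac{1}{2\Delta x^2}(\alpha(u_1^-) + \alpha(D))D.
\end{aligned}$$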

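The two equations can be written as a $2\times 2$ system:

$$\begin{pmatrix} B_{0,0} & B_{0,1} \\ B_{1,0} & B_{1,1} \end{pmatrix}
\begin{pmatrix} u_0 \\ u_1 \end{pmatrix} =
\begin{pmatrix} d_0 \\ d_1 \end{pmatrix},$$

where

$$B_{0,0} = \frac{1}{2\Delta x^2}(\alpha(u_{-1}^-) + 2\alpha(u_0^-) + \alpha(u_1^-)) + a, \tag{5.61}$$

$$B_{0,1} = -\frac{1}{2\Delta x^2}(\alpha(u_{-1}^-) + 2\alpha(u_0^-) + \alpha(u_1^-)), \tag{5.62}$$

$$B_{1,0} = -\frac{1}{2\Delta x^2}(\alpha(u_0^-) + \alpha(u_1^-)), \tag{5.63}$$

$$B_{1,1} = \frac{1}{2\Delta x^2}(\alpha(u_0^-) + 2\alpha(u_1^-) + \alpha(D)) + a, \tag{5.64}$$

$$d_0 = f(u_0^-) - \frac{1}{\alpha(u_0^-)\Delta x}(\alpha(u_{-1}^-) + \alpha(u_0^-))C, \tag{5.65}$$

$$d_1 = f(u_1^-) + \frac{1}{2\Delta x^2}(\alpha(u_1^-) + \alpha(D))D. \tag{5.66}$$

The system with the Dirichlet condition becomes

$$\begin{pmatrix} B_{0,0} & B_{0,1} & 0 \\ B_{1,0} & B_{1,1} & B_{1,2} \\ 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} u_0 \\ u_1 \\ u_2 \end{pmatrix} =
\begin{pmatrix} d_0 \\ d_1 \\ D \end{pmatrix},$$

with

$$B_{1,1} = \frac{1}{2\Delta x^2}(\alpha(u_0^-) + 2\alpha(u_1^-) + \alpha(u_2^-)) + a, \tag{5.67}$$

$$B_{1,2} = -\frac{1}{2\Delta x^2}(\alpha(u_1^-) + \alpha(u_2^-)), \tag{5.68}$$

$$d_1 = f(u_1^-). \tag{5.69}$$

Other entries are as in the $2\times 2$ system.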


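Newton's method The Jacobian must be derived in order to use Newton's method. Here it means that we need to differentiate $F(u) = A(u)u - b(u)$ with respect to the unknown parameters $u_0, u_1, \ldots, u_m$ ($m = N_x$ or $m = N_x - 1$, depending on whether the Dirichlet condition is included in the nonlinear system $F(u) = 0$ or not). Nonlinear equation number $i$ has the structure

$$F_i = A_{i,i-1}(u_{i-1}, u_i)u_{i-1} + A_{i,i}(u_{i-1}, u_i, u_{i+1})u_i + A_{i,i+1}(u_i, u_{i+1})u_{i+1} - b_i(u_i).$$

Computing the Jacobian requires careful differentiation. For example,

$$\begin{aligned}
\frac{\partial}{\partial u_i}\left(A_{i,i}(u_{i-1}, u_i, u_{i+1})u_i\right) &= \frac{\partial A_{i,i}}{\partial u_i}u_i + A_{i,i}\frac{\partial u_i}{\partial u_i} \\
&= \frac{\partial}{\partial u_i}\left(\frac{1}{2\Delta x^2}(\alpha(u_{i-1}) + 2\alpha(u_i) + \alpha(u_{i+1})) + a\right)u_i \\
&\quad + \frac{1}{2\Delta x^2}(\alpha(u_{i-1}) + 2\alpha(u_i) + \alpha(u_{i+1})) + a \\
&= \frac{1}{2\Delta x^2}\left(2\alpha'(u_i)u_i + \alpha(u_{i-1}) + 2\alpha(u_i) + \alpha(u_{i+1})\right) + a.
\end{aligned}$$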



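The complete Jacobian becomes

$$\begin{aligned}
J_{i,i} = \frac{\partial F_i}{\partial u_i} &= \frac{\partial A_{i,i-1}}{\partial u_i}u_{i-1} + \frac{\partial A_{i,i}}{\partial u_i}u_i + A_{i,i} + \frac{\partial A_{i,i+1}}{\partial u_i}u_{i+1} - \frac{\partial b_i}{\partial u_i} \\
&= \frac{1}{2\Delta x^2}\left(-\alpha'(u_i)u_{i-1} + 2\alpha'(u_i)u_i + \alpha(u_{i-1}) + 2\alpha(u_i) + \alpha(u_{i+1})\right) \\
&\quad + a - \frac{1}{2\Delta x^2}\alpha'(u_i)u_{i+1} - b'(u_i),
\end{aligned}$$

$$\begin{aligned}
J_{i,i-1} = \frac{\partial F_i}{\partial u_{i-1}} &= \frac{\partial A_{i,i-1}}{\partial u_{i-1}}u_{i-1} + A_{i,i-1} + \frac{\partial A_{i,i}}{\partial u_{i-1}}u_i - \frac{\partial b_i}{\partial u_{i-1}} \\
&= \frac{1}{2\Delta x^2}\left(-\alpha'(u_{i-1})u_{i-1} - (\alpha(u_{i-1}) + \alpha(u_i)) + \alpha'(u_{i-1})u_i\right),
\end{aligned}$$

$$\begin{aligned}
J_{i,i+1} = \frac{\partial F_i}{\partial u_{i+1}} &= \frac{\partial A_{i,i+1}}{\partial u_{i+1}}u_{i+1} + A_{i,i+1} + \frac{\partial A_{i,i}}{\partial u_{i+1}}u_i - \frac{\partial b_i}{\partial u_{i+1}} \\
&= \frac{1}{2\Delta x^2}\left(-\alpha'(u_{i+1})u_{i+1} - (\alpha(u_i) + \alpha(u_{i+1})) + \alpha'(u_{i+1})u_i\right).
\end{aligned}$$

The explicit expression for nonlinear equation number $i$, $F_i(u_0, u_1, \ldots)$, arises from moving the $f(u_i)$ term in (5.56) to the left-hand side:

$$F_i = -\frac{1}{2\Delta x^2}\left((\alpha(u_i) + \alpha(u_{i+1}))(u_{i+1} - u_i) - (\alpha(u_{i-1}) + \alpha(u_i))(u_i - u_{i-1})\right) + au_i - f(u_i) = 0. \tag{5.70}$$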
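Because hand-derived Jacobians are error-prone, it is useful to compare the analytical entries with a finite difference approximation of $\partial F_i/\partial u_j$. The sketch below does this for the internal points only; the function names, the choices $\alpha(u) = 1 + u^2$ and $f(u) = \sin u$, and the mesh data are assumptions made here for illustration.

```python
import numpy as np

def residual(u, alpha, f, a, dx):
    """F_i from (5.70) at internal points (boundary rows left as zero)."""
    al = alpha(u)
    F = np.zeros(len(u))
    for i in range(1, len(u) - 1):
        F[i] = (-((al[i] + al[i+1])*(u[i+1] - u[i])
                  - (al[i-1] + al[i])*(u[i] - u[i-1]))/(2*dx**2)
                + a*u[i] - f(u[i]))
    return F

def jacobian(u, alpha, dalpha, df, a, dx):
    """Analytical J_{i,i-1}, J_{i,i}, J_{i,i+1} at internal points."""
    al, dal = alpha(u), dalpha(u)
    c = 1.0/(2*dx**2)
    J = np.zeros((len(u), len(u)))
    for i in range(1, len(u) - 1):
        J[i, i-1] = c*(-dal[i-1]*u[i-1] - (al[i-1] + al[i]) + dal[i-1]*u[i])
        J[i, i] = c*(-dal[i]*u[i-1] + 2*dal[i]*u[i] + al[i-1] + 2*al[i]
                     + al[i+1] - dal[i]*u[i+1]) + a - df(u[i])
        J[i, i+1] = c*(-dal[i+1]*u[i+1] - (al[i] + al[i+1]) + dal[i+1]*u[i])
    return J

# Assumed test data (not from the text)
alpha = lambda u: 1 + u**2
dalpha = lambda u: 2*u
f = lambda u: np.sin(u)
df = lambda u: np.cos(u)
a, dx = 0.5, 0.1
u = np.linspace(0, 1, 6)**2 + 0.3      # arbitrary smooth mesh function

J = jacobian(u, alpha, dalpha, df, a, dx)

# Centered finite difference approximation of dF_i/du_j
J_fd = np.zeros_like(J)
eps = 1e-6
for j in range(len(u)):
    e = np.zeros(len(u))
    e[j] = eps
    J_fd[:, j] = (residual(u + e, alpha, f, a, dx)
                  - residual(u - e, alpha, f, a, dx))/(2*eps)

max_diff = np.abs(J - J_fd).max()
print("max deviation between analytical and numerical Jacobian:", max_diff)
```

A deviation at the level of the finite difference error indicates that the analytical entries are consistent with (5.70); the rows for $i = 0$ and $i = N_x$ need the boundary modifications and are not covered by this check.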
At the boundary point $i = 0$, $u_{-1}$ must be replaced using the formula (5.58). When the Dirichlet condition at $i = N_x$ is not a part of the equation system, the last equation $F_m = 0$ for $m = N_x - 1$ involves the quantity $u_{N_x}$, which must be replaced by $D$. If $u_{N_x}$ is treated as an unknown in the system, the last equation $F_m = 0$ has $m = N_x$ and reads

$$F_{N_x}(u_0, \ldots, u_{N_x}) = u_{N_x} - D = 0.$$

Similar replacement of $u_{-1}$ and $u_{N_x}$ must be done in the Jacobian for the first and last row. When $u_{N_x}$ is included as an unknown, the last row in the Jacobian must help implement the condition $\delta u_{N_x} = 0$, since we assume that $u$ contains the right Dirichlet value at the beginning of the iteration ($u_{N_x} = D$), and then the Newton update should be zero for $i = N_x$, i.e., $\delta u_{N_x} = 0$. This also forces the right-hand side to be $b_i = 0$, $i = N_x$.

We have seen, and can see from the present example, that the linear system in Newton's method contains all the terms present in the system that arises in the Picard iteration method. The extra terms in Newton's method can be multiplied by a factor such that it is easy to program one linear system and set this factor to 0 or 1 to generate the Picard or Newton system.

5.5 Multi-Dimensional Nonlinear PDE Problems


The fundamental ideas in the derivation of $F_i$ and $J_{i,j}$ in the 1D model problem are easily generalized to multi-dimensional problems. Nevertheless, the expressions involved are slightly different, with derivatives in $x$ replaced by $\nabla$, so we present some examples below in detail.


5.5.1 Finite Difference Discretization 
A typical diffusion equation

$$u_t = \nabla\cdot(\alpha(u)\nabla u) + f(u),$$

can be discretized by (e.g.) a Backward Euler scheme, which in 2D can be written

$$[D_t^- u = D_x\overline{\alpha(u)}^x D_x u + D_y\overline{\alpha(u)}^y D_y u + f(u)]^n_{i,j}.$$

We do not dive into the details of handling boundary conditions now. Dirichlet and Neumann conditions are handled as in corresponding linear, variable-coefficient diffusion problems.

Writing the scheme out, putting the unknown values on the left-hand side and known values on the right-hand side, and introducing $\Delta x = \Delta y = h$ to save some writing, one gets

$$\begin{aligned}
u^n_{i,j} - \frac{\Delta t}{h^2}\Big(&\tfrac{1}{2}(\alpha(u^n_{i,j}) + \alpha(u^n_{i+1,j}))(u^n_{i+1,j} - u^n_{i,j}) \\
&- \tfrac{1}{2}(\alpha(u^n_{i-1,j}) + \alpha(u^n_{i,j}))(u^n_{i,j} - u^n_{i-1,j}) \\
&+ \tfrac{1}{2}(\alpha(u^n_{i,j}) + \alpha(u^n_{i,j+1}))(u^n_{i,j+1} - u^n_{i,j}) \\
&- \tfrac{1}{2}(\alpha(u^n_{i,j-1}) + \alpha(u^n_{i,j}))(u^n_{i,j} - u^n_{i,j-1})\Big) - \Delta t f(u^n_{i,j}) = u^{n-1}_{i,j}.
\end{aligned}$$

This defines a nonlinear algebraic system of the form $A(u)u = b(u)$.


Picard iteration The most recently computed values $u^-$ of $u^n$ can be used in $\alpha$ and $f$ for a Picard iteration, or equivalently, we solve $A(u^-)u = b(u^-)$. The result is a linear system of the same type as arising from $u_t = \nabla\cdot(\alpha(\boldsymbol{x})\nabla u) + f(\boldsymbol{x}, t)$.

The Picard iteration scheme can also be expressed in operator notation:

$$[D_t^- u = D_x\overline{\alpha(u^-)}^x D_x u + D_y\overline{\alpha(u^-)}^y D_y u + f(u^-)]^n_{i,j}.$$

Newton's method As always, Newton's method is technically more involved than Picard iteration. We first define the nonlinear algebraic equations to be solved, drop the superscript $n$ (use $u$ for $u^n$), and introduce $u^{(1)}$ for $u^{n-1}$:

$$\begin{aligned}
F_{i,j} = u_{i,j} - \frac{\Delta t}{h^2}\Big(&\tfrac{1}{2}(\alpha(u_{i,j}) + \alpha(u_{i+1,j}))(u_{i+1,j} - u_{i,j}) \\
&- \tfrac{1}{2}(\alpha(u_{i-1,j}) + \alpha(u_{i,j}))(u_{i,j} - u_{i-1,j}) \\
&+ \tfrac{1}{2}(\alpha(u_{i,j}) + \alpha(u_{i,j+1}))(u_{i,j+1} - u_{i,j}) \\
&- \tfrac{1}{2}(\alpha(u_{i,j-1}) + \alpha(u_{i,j}))(u_{i,j} - u_{i,j-1})\Big) \\
&- \Delta t\, f(u_{i,j}) - u^{(1)}_{i,j} = 0.
\end{aligned}$$

It is convenient to work with two indices $i$ and $j$ in 2D finite difference discretizations, but it complicates the derivation of the Jacobian, which then gets four indices. (Make sure you really understand the 1D version of this problem as treated in Sect. 5.4.1.) The left-hand expression of an equation $F_{i,j} = 0$ is to be differentiated with respect to each of the unknowns $u_{r,s}$ (recall that this is short notation for $u^n_{r,s}$), $r\in I_x$, $s\in I_y$:

$$J_{i,j,r,s} = \frac{\partial F_{i,j}}{\partial u_{r,s}}.$$

The Newton system to be solved in each iteration can be written as

$$\sum_{r\in I_x}\sum_{s\in I_y} J_{i,j,r,s}\,\delta u_{r,s} = -F_{i,j},\quad i\in I_x,\ j\in I_y.$$

Given $i$ and $j$, only a few $r$ and $s$ indices give nonzero contribution to the Jacobian since $F_{i,j}$ contains $u_{i\pm 1,j}$, $u_{i,j\pm 1}$, and $u_{i,j}$. This means that $J_{i,j,r,s}$ has nonzero contributions only if $r = i\pm 1$, $s = j\pm 1$, as well as $r = i$ and $s = j$. The corresponding terms in $J_{i,j,r,s}$ are $J_{i,j,i-1,j}$, $J_{i,j,i+1,j}$, $J_{i,j,i,j-1}$, $J_{i,j,i,j+1}$, and $J_{i,j,i,j}$. Therefore, the left-hand side of the Newton system, $\sum_r\sum_s J_{i,j,r,s}\,\delta u_{r,s}$, collapses to

$$\begin{aligned}
J_{i,j,r,s}\,\delta u_{r,s} ={}& J_{i,j,i,j}\,\delta u_{i,j} + J_{i,j,i-1,j}\,\delta u_{i-1,j} + J_{i,j,i+1,j}\,\delta u_{i+1,j} + J_{i,j,i,j-1}\,\delta u_{i,j-1} \\
&+ J_{i,j,i,j+1}\,\delta u_{i,j+1}.
\end{aligned}$$
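The specific derivatives follow from differentiating the expression for $F_{i,j}$ above term by term:

$$J_{i,j,i-1,j} = \frac{\partial F_{i,j}}{\partial u_{i-1,j}} = \frac{\Delta t}{2h^2}\left(\alpha'(u_{i-1,j})(u_{i,j} - u_{i-1,j}) - \alpha(u_{i-1,j}) - \alpha(u_{i,j})\right),$$

$$J_{i,j,i+1,j} = \frac{\partial F_{i,j}}{\partial u_{i+1,j}} = \frac{\Delta t}{2h^2}\left(-\alpha'(u_{i+1,j})(u_{i+1,j} - u_{i,j}) - \alpha(u_{i,j}) - \alpha(u_{i+1,j})\right),$$

$$J_{i,j,i,j-1} = \frac{\partial F_{i,j}}{\partial u_{i,j-1}} = \frac{\Delta t}{2h^2}\left(\alpha'(u_{i,j-1})(u_{i,j} - u_{i,j-1}) - \alpha(u_{i,j-1}) - \alpha(u_{i,j})\right),$$

$$J_{i,j,i,j+1} = \frac{\partial F_{i,j}}{\partial u_{i,j+1}} = \frac{\Delta t}{2h^2}\left(-\alpha'(u_{i,j+1})(u_{i,j+1} - u_{i,j}) - \alpha(u_{i,j}) - \alpha(u_{i,j+1})\right).$$

The $J_{i,j,i,j}$ entry has a few more terms and is left as an exercise. Inserting the most recent approximation $u^-$ for $u$ in the $J$ and $F$ formulas and then forming $J\delta u = -F$ gives the linear system to be solved in each Newton iteration. Boundary conditions will affect the formulas when any of the indices coincide with a boundary value of an index.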


5.5.2 Continuation Methods 


Picard iteration or Newton's method may diverge when solving PDEs with severe nonlinearities. Relaxation with $\omega < 1$ may help, but in highly nonlinear problems it can be necessary to introduce a continuation parameter $\Lambda$ in the problem: $\Lambda = 0$ gives a version of the problem that is easy to solve, while $\Lambda = 1$ is the target problem. The idea is then to increase $\Lambda$ in steps, $\Lambda_0 = 0$, $\Lambda_1 < \cdots < \Lambda_n = 1$, and use the solution from the problem with $\Lambda_{i-1}$ as initial guess for the iterations in the problem corresponding to $\Lambda_i$.

The continuation method is easiest to understand through an example. Suppose we intend to solve

$$-\nabla\cdot\left(||\nabla u||^q\nabla u\right) = f,$$

which is an equation modeling the flow of a non-Newtonian fluid through a channel or pipe. For $q = 0$ we have the Poisson equation (corresponding to a Newtonian fluid) and the problem is linear. A typical value for pseudo-plastic fluids may be $q_n = -0.8$. We can introduce the continuation parameter $\Lambda\in[0,1]$ such that $q = q_n\Lambda$. Let $\{\Lambda_\ell\}_{\ell=0}^n$ be the sequence of $\Lambda$ values in $[0,1]$, with corresponding $q$ values $\{q_\ell\}_{\ell=0}^n$. We can then solve a sequence of problems

$$-\nabla\cdot\left(||\nabla u^\ell||^{q_\ell}\nabla u^\ell\right) = f,\quad \ell = 0,\ldots,n,$$

where the initial guess for iterating on $u^\ell$ is the previously computed solution $u^{\ell-1}$. If a particular $\Lambda_\ell$ leads to convergence problems, one may try a smaller increase in $\Lambda$: $\Lambda_* = \frac{1}{2}(\Lambda_{\ell-1} + \Lambda_\ell)$, and repeat halving the step in $\Lambda$ until convergence is reestablished.

5.6 Operator Splitting Methods


Operator splitting is a natural and old idea. When a PDE or system of PDEs contains 
different terms expressing different physics, it is natural to use different numerical 
methods for different physical processes. This can optimize and simplify the overall 
solution process. The idea was especially popularized in the context of the Navier- 
Stokes equations and reaction-diffusion PDEs. Common names for the technique 
are operator splitting, fractional step methods, and split-step methods. We shall 
stick to the former name. In the context of nonlinear differential equations, operator 
splitting can be used to isolate nonlinear terms and simplify the solution methods. 

A related technique, often known as dimensional splitting or alternating direction 
implicit (ADI) methods, is to split the spatial dimensions and solve a 2D or 3D 
problem as two or three consecutive 1D problems, but this type of splitting is not to 
be further considered here. 


5.6.1 Ordinary Operator Splitting for ODEs 
Consider first an ODE where the right-hand side is split into two terms:

$$u' = f_0(u) + f_1(u). \tag{5.71}$$

In case $f_0$ and $f_1$ are linear functions of $u$, $f_0 = au$ and $f_1 = bu$, we have $u(t) = Ie^{(a+b)t}$, if $u(0) = I$. When going one time step of length $\Delta t$ from $t_n$ to $t_{n+1}$, we have

$$u(t_{n+1}) = u(t_n)e^{(a+b)\Delta t}.$$

This expression can also be written as

$$u(t_{n+1}) = u(t_n)e^{a\Delta t}e^{b\Delta t},$$

or

$$u^* = u(t_n)e^{a\Delta t}, \tag{5.72}$$

$$u(t_{n+1}) = u^* e^{b\Delta t}. \tag{5.73}$$

The first step (5.72) means solving $u' = f_0$ over a time interval $\Delta t$ with $u(t_n)$ as start value. The second step (5.73) means solving $u' = f_1$ over a time interval $\Delta t$ with the value at the end of the first step as start value. That is, we progress the solution in two steps and solve two ODEs $u' = f_0$ and $u' = f_1$. The order of the equations is not important. From the derivation above we see that solving $u' = f_1$ prior to $u' = f_0$ can equally well be done.
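The exactness of the two-step procedure in the linear case is easy to confirm numerically. The following sketch applies (5.72) and (5.73) repeatedly and compares with the exact solution; the values of $a$, $b$, $I$, and $\Delta t$ are arbitrary assumptions chosen for the demonstration.

```python
import numpy as np

a, b = -1.0, 0.5          # assumed coefficients in f_0 = a*u and f_1 = b*u
I, dt, Nt = 2.0, 0.1, 10  # assumed initial value, time step, number of steps

u = I
for n in range(Nt):
    u_star = u*np.exp(a*dt)    # first step (5.72): solve u' = a*u over dt
    u = u_star*np.exp(b*dt)    # second step (5.73): solve u' = b*u over dt

u_exact = I*np.exp((a + b)*Nt*dt)
print("splitting error for linear ODE:", abs(u - u_exact))
```

The difference is at the level of rounding errors, in agreement with the factorization $e^{(a+b)\Delta t} = e^{a\Delta t}e^{b\Delta t}$.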

The technique is exact if the ODEs are linear. For nonlinear ODEs it is only an approximate method with error $\mathcal{O}(\Delta t)$. The technique can be extended to an arbitrary number of steps; i.e., we may split the ODE or PDE system into any number of subsystems. Examples will illuminate this principle.

5.6.2 Strang Splitting for ODEs


The accuracy of the splitting method in Sect. 5.6.1 can be improved from $\mathcal{O}(\Delta t)$ to $\mathcal{O}(\Delta t^2)$ using so-called Strang splitting, where we take half a step with the $f_0$ operator, a full step with the $f_1$ operator, and finally another half step with the $f_0$ operator. During a time interval $\Delta t$ the algorithm can be written as follows.

$$\frac{du^*}{dt} = f_0(u^*),\quad u^*(t_n) = u(t_n),\quad t\in[t_n, t_n + \tfrac{1}{2}\Delta t],$$

$$\frac{du^{***}}{dt} = f_1(u^{***}),\quad u^{***}(t_n) = u^*(t_{n+\frac{1}{2}}),\quad t\in[t_n, t_n + \Delta t],$$

$$\frac{du^{**}}{dt} = f_0(u^{**}),\quad u^{**}(t_{n+\frac{1}{2}}) = u^{***}(t_{n+1}),\quad t\in[t_n + \tfrac{1}{2}\Delta t, t_n + \Delta t].$$

The global solution is set as $u(t_{n+1}) = u^{**}(t_{n+1})$.

There is no use in combining higher-order methods with ordinary splitting since the error due to splitting is $\mathcal{O}(\Delta t)$, but for Strang splitting it makes sense to use schemes of order $\mathcal{O}(\Delta t^2)$.

With the notation introduced for Strang splitting, we may express ordinary first-order splitting as

$$\frac{du^*}{dt} = f_0(u^*),\quad u^*(t_n) = u(t_n),\quad t\in[t_n, t_n + \Delta t],$$

$$\frac{du^{**}}{dt} = f_1(u^{**}),\quad u^{**}(t_n) = u^*(t_{n+1}),\quad t\in[t_n, t_n + \Delta t],$$

with global solution set as $u(t_{n+1}) = u^{**}(t_{n+1})$.
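To see the two orders of accuracy in isolation, one can solve every sub-problem exactly so that only the splitting error remains. The sketch below does this for an assumed test problem, $u' = u - u^2$, $u(0) = 0.1$, split as $f_0(u) = u$ and $f_1(u) = -u^2$, whose sub-problems have the closed-form solutions used in `flow_0` and `flow_1`. The function names, the end time, and the time steps are choices made here for illustration.

```python
import numpy as np

def flow_0(u, tau):
    """Exact solution of u' = u over a time interval tau."""
    return u*np.exp(tau)

def flow_1(u, tau):
    """Exact solution of u' = -u**2 over a time interval tau."""
    return u/(1.0 + u*tau)

def split_solve(dt, T, strang):
    """Advance from t=0 to t=T by ordinary or Strang splitting."""
    u = 0.1
    for n in range(int(round(T/dt))):
        if strang:
            u = flow_0(u, dt/2)   # half step with f_0
            u = flow_1(u, dt)     # full step with f_1
            u = flow_0(u, dt/2)   # half step with f_0
        else:
            u = flow_0(u, dt)     # full step with f_0
            u = flow_1(u, dt)     # full step with f_1
    return u

def observed_rate(strang, T=1.0, dt=0.02):
    """Estimate r in error ~ C*dt**r from two time step sizes."""
    u_e = 1.0/(9*np.exp(-T) + 1)  # exact solution of u' = u(1-u), u(0)=0.1
    e_coarse = abs(split_solve(dt, T, strang) - u_e)
    e_fine = abs(split_solve(dt/2, T, strang) - u_e)
    return np.log(e_coarse/e_fine)/np.log(2.0)

r_ordinary = observed_rate(strang=False)
r_strang = observed_rate(strang=True)
print("ordinary splitting rate:", r_ordinary)
print("Strang splitting rate:  ", r_strang)
```

With exact sub-steps the observed rates should be close to 1 for ordinary splitting and close to 2 for Strang splitting.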


5.6.3 Example: Logistic Growth 


Let us split the (scaled) logistic equation

$$u' = u(1 - u),\quad u(0) = 0.1,$$

with solution $u = (9e^{-t} + 1)^{-1}$, into

$$u' = u - u^2 = f_0(u) + f_1(u),\quad f_0(u) = u,\quad f_1(u) = -u^2.$$

We solve $u' = f_0(u)$ and $u' = f_1(u)$ by a Forward Euler step. In addition, we add a method where we solve $u' = f_0(u)$ analytically, since the equation is actually $u' = u$ with solution proportional to $e^t$. The software that accompanies the following methods is the file split_logistic.py.


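Splitting techniques Ordinary splitting takes a Forward Euler step for each of the ODEs according to

$$\frac{u^{*,n+1} - u^{*,n}}{\Delta t} = f_0(u^{*,n}),\quad u^{*,n} = u(t_n),\quad t\in[t_n, t_n + \Delta t], \tag{5.74}$$

$$\frac{u^{**,n+1} - u^{**,n}}{\Delta t} = f_1(u^{**,n}),\quad u^{**,n} = u^{*,n+1},\quad t\in[t_n, t_n + \Delta t], \tag{5.75}$$

with $u(t_{n+1}) = u^{**,n+1}$.

Strang splitting takes the form

$$\frac{u^{*,n+\frac{1}{2}} - u^{*,n}}{\frac{1}{2}\Delta t} = f_0(u^{*,n}),\quad u^{*,n} = u(t_n),\quad t\in[t_n, t_n + \tfrac{1}{2}\Delta t], \tag{5.76}$$

$$\frac{u^{***,n+1} - u^{***,n}}{\Delta t} = f_1(u^{***,n}),\quad u^{***,n} = u^{*,n+\frac{1}{2}},\quad t\in[t_n, t_n + \Delta t], \tag{5.77}$$

$$\frac{u^{**,n+1} - u^{**,n+\frac{1}{2}}}{\frac{1}{2}\Delta t} = f_0(u^{**,n+\frac{1}{2}}),\quad u^{**,n+\frac{1}{2}} = u^{***,n+1},\quad t\in[t_n + \tfrac{1}{2}\Delta t, t_n + \Delta t]. \tag{5.78}$$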

Verbose implementation The following function computes four solutions aris- 
ing from the Forward Euler method, ordinary splitting, Strang splitting, as well as 
Strang splitting with exact treatment of u’ = fo(u): 


import numpy as np


def solver(dt, T, f, f_0, f_1):
    """
    Solve u'=f by the Forward Euler method and by ordinary and
    Strang splitting: f(u) = f_0(u) + f_1(u).
    """
    Nt = int(round(T/float(dt)))
    t = np.linspace(0, Nt*dt, Nt+1)
    u_FE = np.zeros(len(t))
    u_split1 = np.zeros(len(t))  # 1st-order splitting
    u_split2 = np.zeros(len(t))  # 2nd-order splitting
    u_split3 = np.zeros(len(t))  # 2nd-order splitting w/exact f_0

    # Set initial values
    u_FE[0] = 0.1
    u_split1[0] = 0.1
    u_split2[0] = 0.1
    u_split3[0] = 0.1

    for n in range(len(t)-1):
        # Forward Euler method
        u_FE[n+1] = u_FE[n] + dt*f(u_FE[n])

        # --- Ordinary splitting ---
        # First step
        u_s_n = u_split1[n]
        u_s = u_s_n + dt*f_0(u_s_n)
        # Second step
        u_ss_n = u_s
        u_ss = u_ss_n + dt*f_1(u_ss_n)
        u_split1[n+1] = u_ss

        # --- Strang splitting ---
        # First step
        u_s_n = u_split2[n]
        u_s = u_s_n + dt/2.*f_0(u_s_n)
        # Second step
        u_sss_n = u_s
        u_sss = u_sss_n + dt*f_1(u_sss_n)
        # Third step
        u_ss_n = u_sss
        u_ss = u_ss_n + dt/2.*f_0(u_ss_n)
        u_split2[n+1] = u_ss

        # --- Strang splitting using exact integrator for u'=f_0 ---
        # First step
        u_s_n = u_split3[n]
        u_s = u_s_n*np.exp(dt/2.)  # exact
        # Second step
        u_sss_n = u_s
        u_sss = u_sss_n + dt*f_1(u_sss_n)
        # Third step
        u_ss_n = u_sss
        u_ss = u_ss_n*np.exp(dt/2.)  # exact
        u_split3[n+1] = u_ss

    return u_FE, u_split1, u_split2, u_split3, t


Compact implementation We have used quite many lines for the steps in the split- 
ting methods. Many will prefer to condense the code a bit, as done here: 


        # Ordinary splitting
        u_s = u_split1[n] + dt*f_0(u_split1[n])
        u_split1[n+1] = u_s + dt*f_1(u_s)

        # Strang splitting
        u_s = u_split2[n] + dt/2.*f_0(u_split2[n])
        u_sss = u_s + dt*f_1(u_s)
        u_split2[n+1] = u_sss + dt/2.*f_0(u_sss)

        # Strang splitting using exact integrator for u'=f_0
        u_s = u_split3[n]*np.exp(dt/2.)  # exact
        u_ss = u_s + dt*f_1(u_s)
        u_split3[n+1] = u_ss*np.exp(dt/2.)


Results Figure 5.3 shows that the impact of splitting is significant. Interestingly, 
however, the Forward Euler method applied to the entire problem directly is much 
more accurate than any of the splitting schemes. We also see that Strang splitting is 
definitely more accurate than ordinary splitting and that it helps a bit to use an exact 
solution of u’ = fo(u). With a large time step (At = 0.2, left plot in Fig. 5.3), the 
asymptotic values are off by 20-30%. A more reasonable time step (At = 0.05, 
right plot in Fig. 5.3) gives better results, but still the asymptotic values are up to 
10 % wrong. 

As a technique for solving nonlinear ODEs, we realize that the present case study is not particularly promising, as the Forward Euler method both linearizes the original problem and provides a solution that is much more accurate than any of the splitting techniques. In complicated multi-physics settings, on the other hand, splitting may be the only feasible way to go, and sometimes you really need to apply different numerics to different parts of a PDE problem. But in very simple problems, like the logistic ODE, splitting is just an inferior technique. Still, the logistic ODE is ideal for introducing all the mathematical details and for investigating the behavior.

Fig. 5.3 Effect of ordinary and Strang splitting for the logistic equation. Left panel: time step 0.2; right panel: time step 0.05. Curves: no split, split, strang, strang w/exact f_0, exact.


5.6.4 Reaction-Diffusion Equation 


Consider a diffusion equation coupled to chemical reactions modeled by a nonlinear term $f(u)$:

$$\frac{\partial u}{\partial t} = \alpha\nabla^2 u + f(u).$$

This is a physical process composed of two individual processes: $u$ is the concentration of a substance that is locally generated by a chemical reaction $f(u)$, while $u$ is spreading in space because of diffusion. There are obviously two time scales: one for the chemical reaction and one for diffusion. Typically, fast chemical reactions require much finer time stepping than slower diffusion processes. It could therefore be advantageous to split the two physical effects in separate models and use different numerical methods for the two.

A natural splitting in the present case is

$$\frac{\partial u^*}{\partial t} = \alpha\nabla^2 u^*, \tag{5.79}$$

$$\frac{\partial u^{**}}{\partial t} = f(u^{**}). \tag{5.80}$$

Looking at these familiar problems, we may apply a $\theta$ rule (implicit) scheme for (5.79) over one time step and avoid dealing with nonlinearities by applying an explicit scheme for (5.80) over the same time step.

Suppose we have some solution $u$ at time level $t_n$. For flexibility, we define a $\theta$ method for the diffusion part (5.79) by

$$[D_t u^* = \alpha(D_xD_x u^* + D_yD_y u^*)]^{n+\theta}.$$

We use $u^n$ as initial condition for $u^*$.

The reaction part, which is defined at each mesh point (without coupling values in different mesh points), can employ any scheme for an ODE. Here we use an Adams-Bashforth method of order 2. Recall that the overall accuracy of the splitting method is at most $\mathcal{O}(\Delta t^2)$ for Strang splitting; otherwise it is just $\mathcal{O}(\Delta t)$. Higher-order methods for ODEs will therefore be a waste of work. The 2nd-order Adams-Bashforth method reads

$$u^{**,n+1}_{i,j} = u^{**,n}_{i,j} + \frac{1}{2}\Delta t\left(3f(u^{**,n}_{i,j}, t_n) - f(u^{**,n-1}_{i,j}, t_{n-1})\right). \tag{5.81}$$

We can use a Forward Euler step to start the method, i.e., compute $u^{**,1}_{i,j}$.

The algorithm goes like this:

1. Solve the diffusion problem for one time step as usual.
2. Solve the reaction ODEs at each mesh point in $[t_n, t_n + \Delta t]$, using the diffusion solution in 1. as initial condition. The solution of the ODEs constitutes the solution of the original problem at the end of each time step.


We may use a much smaller time step when solving the reaction part, adapted to 
the dynamics of the problem u’ = f(u). This gives great flexibility in splitting 
methods. 
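The two-step algorithm, including a finer time step for the reaction part, can be sketched as follows. This is a simplified illustration under stated assumptions and not the code in the accompanying files: it is 1D, uses Backward Euler ($\theta = 1$) for diffusion, Forward Euler substeps (instead of Adams-Bashforth) for the reaction, homogeneous Dirichlet conditions, a dense linear solve, and a linear reaction term $f(u) = -bu$ so that an exact solution is available for checking. The name `split_step` and all parameter values are hypothetical.

```python
import numpy as np

def split_step(u, dt, dx, a, f, n_sub):
    """One ordinary-splitting step: Backward Euler for u_t = a*u_xx over dt,
    then n_sub Forward Euler substeps for u' = f(u) at every mesh point."""
    N = len(u) - 1
    F = a*dt/dx**2                       # mesh Fourier number
    A = np.zeros((N + 1, N + 1))
    for i in range(1, N):
        A[i, i-1] = A[i, i+1] = -F
        A[i, i] = 1 + 2*F
    A[0, 0] = A[N, N] = 1.0              # Dirichlet rows: u = 0 at both ends
    rhs = u.copy()
    rhs[0] = rhs[N] = 0.0
    u_star = np.linalg.solve(A, rhs)     # step 1: diffusion
    u_ss = u_star                        # step 2: reaction, start from step 1
    dt_r = dt/n_sub                      # smaller time step for the reaction
    for _ in range(n_sub):
        u_ss = u_ss + dt_r*f(u_ss)
    return u_ss

# Assumed test problem with known solution exp(-(a*k**2 + b)*t)*sin(k*x)
L, a, b, T = 1.0, 1.0, 1.0, 0.1
Nx, dt = 50, 0.001
x = np.linspace(0, L, Nx + 1)
k = np.pi/L
u = np.sin(k*x)
for n in range(int(round(T/dt))):
    u = split_step(u, dt, x[1] - x[0], a, lambda v: -b*v, n_sub=4)

u_e = np.exp(-(a*k**2 + b)*T)*np.sin(k*x)
error = np.abs(u - u_e).max()
print("max error at t=T:", error)
```

The parameter `n_sub` controls how many reaction substeps are taken per diffusion step, which is where the flexibility of the splitting approach shows up in code.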


5.6.5 Example: Reaction-Diffusion with Linear Reaction Term 


The methods above may be explored in detail through a specific computational example in which we compute the convergence rates associated with four different solution approaches for the reaction-diffusion equation with a linear reaction term, i.e., $f(u) = -bu$. The methods comprise solving without splitting (just straight Forward Euler), ordinary splitting, first order Strang splitting, and second order Strang splitting. In all four methods, a standard centered difference approximation is used for the spatial second derivative. The methods share the error model $E = Ch^r$, while differing in the step $h$ (being either $\Delta x^2$ or $\Delta x$) and the convergence rate $r$ (being either 1 or 2).

All code commented below is found in the file split_diffu_react.py. When 
executed, a function convergence_rates is called, from which all convergence 
rate computations are handled: 


def convergence_rates(scheme='diffusion'):

    F = 0.5  # Upper limit for FE (stability). For CN, this
             # limit does not apply, but for simplicity, we
             # choose F = 0.5 as the initial F value.
    # The problem parameters T, a, b, L and k are set here

    def exact(x, t):
        '''exact sol. to: du/dt = a*d^2u/dx^2 - b*u'''
        return np.exp(-(a*k**2 + b)*t) * np.sin(k*x)

    def f(u, t):
        return -b*u

    def I(x):
        return exact(x, 0)

    global error  # error computed in the user action function
    error = 0

    # Convergence study
    def action(u, x, t, n):
        global error
        if n == 1:  # New simulation, - reset error
            error = 0
        else:
            error = max(error, np.abs(u - exact(x, t[n])).max())

    E = []
    h = []
    Nx_values = [10, 20, 40, 80]  # i.e., dx halved each time
    for Nx in Nx_values:
        dx = L/Nx
        if scheme == 'Strang_splitting_2ndOrder':
            print('Strang splitting with 2nd order schemes...')
            # In this case, E = C*h**r (with r = 2) and since
            # h = dx = K*dt, the ratio dt/dx must be constant.
            # To fulfill this demand, we must let F change
            # when dx changes. From F = a*dt/dx**2, it follows
            # that halving dx AND doubling F assures dt/dx const.
            # Initially, we simply choose F = 0.5.

            dt = F/a*dx**2
            #print('dt/dx:', dt/dx)
            Nt = int(round(T/float(dt)))
            t = np.linspace(0, Nt*dt, Nt+1)  # global time
            Strang_splitting_2ndOrder(I=I, a=a, b=b, f=f, L=L, dt=dt,
                                      dt_Rfactor=1, F=F, t=t, T=T,
                                      user_action=action)
            h.append(dx)
            # prepare for next iteration (make F match dx/2)
            F = F*2  # assures dt/dx const. when dx = dx/2
        else:
            # In these cases, E = C*h**r (with r = 1) and since
            # h = dx**2 = K*dt, the ratio dt/dx**2 must be constant.
            # This is fulfilled by choosing F = 0.5 (for FE stability)
            # and make sure that F, dx and dt comply to F = a*dt/dx**2.
            dt = F/a*dx**2
            Nt = int(round(T/float(dt)))
            t = np.linspace(0, Nt*dt, Nt+1)  # global time
            if scheme == 'diffusion':
                print('FE on whole eqn...')
                diffusion_theta(I, a, f, L, dt, F, t, T,
                                step_no=0, theta=0,
                                u_L=0, u_R=0, user_action=action)
                h.append(dx**2)
            elif scheme == 'ordinary_splitting':
                print('Ordinary splitting...')
                ordinary_splitting(I=I, a=a, b=b, f=f, L=L, dt=dt,
                                   dt_Rfactor=1, F=F, t=t, T=T,
                                   user_action=action)
                h.append(dx**2)
            elif scheme == 'Strang_splitting_1stOrder':
                print('Strang splitting with 1st order schemes...')
                Strang_splitting_1stOrder(I=I, a=a, b=b, f=f, L=L, dt=dt,
                                          dt_Rfactor=1, F=F, t=t, T=T,
                                          user_action=action)
                h.append(dx**2)
            else:
                print('Unknown scheme requested!')
                sys.exit(0)

            #print('dt/dx**2:', dt/dx**2)

        E.append(error)
        Nx *= 2  # Nx doubled gives dx/2

    print('E:', E)
    print('h:', h)

    # Convergence rates
    r = [np.log(E[i]/E[i-1])/np.log(h[i]/h[i-1])
         for i in range(1, len(Nx_values))]
    print('Computed rates:', r)


if __name__ == '__main__':

    schemes = ['diffusion',
               'ordinary_splitting',
               'Strang_splitting_1stOrder',
               'Strang_splitting_2ndOrder']

    for scheme in schemes:
        convergence_rates(scheme=scheme)


Now, with respect to the error ($E = Ch^r$), the Forward Euler scheme, the ordinary splitting scheme and first order Strang splitting scheme are all first order ($r = 1$), with a step $h = \Delta x^2 = K^{-1}\Delta t$, where $K$ is some constant. This implies that the ratio $\Delta t/\Delta x^2$ must be held constant during convergence rate calculations. Furthermore, the Fourier number $F = \alpha\Delta t/\Delta x^2$ is upwards limited to $F = 0.5$, being the stability limit with explicit schemes. Thus, in these cases, we use the fixed value of $F$ and a given (but changing) spatial resolution $\Delta x$ to compute the corresponding value of $\Delta t$ according to the expression for $F$. This assures that $\Delta t/\Delta x^2$ is kept constant. The loop in convergence_rates runs over a chosen set of grid points (Nx_values) which gives a doubling of spatial resolution with each iteration ($\Delta x$ is halved).

For the second order Strang splitting scheme, we have $r = 2$ and a step $h = \Delta x = K^{-1}\Delta t$, where $K$ again is some constant. In this case, it is thus the ratio $\Delta t/\Delta x$ that must be held constant during the convergence rate calculations. From the expression for $F$, it is clear then that $F$ must change with each halving of $\Delta x$. In fact, if $F$ is doubled each time $\Delta x$ is halved, the ratio $\Delta t/\Delta x$ will be constant (this follows, e.g., from the expression for $F$). This is utilized in our code.
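The effect of the two refinement strategies can be checked with a few lines of arithmetic. The following is a minimal sketch and not part of split_diffu_react.py: the helper name refinement_table and the default values a = 3.5, L = 1.5 and initial F = 0.5 are assumptions chosen for illustration only.

```python
def refinement_table(a=3.5, L=1.5, F0=0.5, Nx_values=(10, 20, 40, 80),
                     double_F=False):
    """Return a list of (dx, dt, dt/dx, dt/dx**2), one tuple per mesh.

    double_F=False: F is kept fixed (the first-order schemes).
    double_F=True:  F is doubled each time dx is halved
                    (the second-order Strang splitting).
    """
    F = F0
    rows = []
    for Nx in Nx_values:
        dx = L/float(Nx)
        dt = F/a*dx**2          # from F = a*dt/dx**2
        rows.append((dx, dt, dt/dx, dt/dx**2))
        if double_F:
            F *= 2              # prepare for the next (halved) dx
    return rows

fixed_F = refinement_table(double_F=False)    # dt/dx**2 stays constant
doubled_F = refinement_table(double_F=True)   # dt/dx stays constant
```

With a fixed $F$ the fourth column is the same on every mesh, while doubling $F$ makes the third column constant instead. Note that the doubled $F$ quickly exceeds 0.5, which is acceptable only because the second order variant uses $\theta = 0.5$ in the diffusion step.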

A solver diffusion_theta is used in each of the four solution approaches: 


def diffusion_theta(I, a, f, L, dt, F, t, T, step_no, theta=0.5,
                    u_L=0, u_R=0, user_action=None):
    """
    Full solver for the model problem using the theta-rule
    difference approximation in time (no restriction on F,
    i.e., the time step when theta >= 0.5). Vectorized
    implementation and sparse (tridiagonal) coefficient matrix.
    Note that t always covers the whole global time interval, whether
    splitting is the case or not. T, on the other hand, is
    the end of the global time interval if there is no split,
    but if splitting, we use T=dt. When splitting, step_no
    keeps track of the time step number (for lookup in t).
    """

    Nt = int(round(T/float(dt)))
    dx = np.sqrt(a*dt/F)
    Nx = int(round(L/dx))
    x = np.linspace(0, L, Nx+1)   # Mesh points in space

    # Make sure dx and dt are compatible with x and t
    dx = x[1] - x[0]
    dt = t[1] - t[0]

    u   = np.zeros(Nx+1)   # solution array at t[n+1]
    u_1 = np.zeros(Nx+1)   # solution at t[n]

    # Representation of sparse matrix and right-hand side
    diagonal = np.zeros(Nx+1)
    lower    = np.zeros(Nx)
    upper    = np.zeros(Nx)
    b        = np.zeros(Nx+1)

    # Precompute sparse matrix (scipy format)
    Fl = F*theta
    Fr = F*(1-theta)
    diagonal[:] = 1 + 2*Fl
    lower[:] = -Fl  #1
    upper[:] = -Fl  #1
    # Insert boundary conditions
    diagonal[0] = 1
    upper[0] = 0
    diagonal[Nx] = 1
    lower[-1] = 0

    diags = [0, -1, 1]
    A = scipy.sparse.diags(
        diagonals=[diagonal, lower, upper],
        offsets=[0, -1, 1], shape=(Nx+1, Nx+1),
        format='csr')
    #print A.todense()

    # Allow f to be None or 0
    if f is None or f == 0:
        f = lambda x, t: np.zeros((x.size)) \
            if isinstance(x, np.ndarray) else 0

    # Set initial condition
    if isinstance(I, np.ndarray):   # I is an array
        u_1 = np.copy(I)
    else:                           # I is a function
        for i in range(0, Nx+1):
            u_1[i] = I(x[i])

    if user_action is not None:
        user_action(u_1, x, t, step_no+0)

    # Time loop
    for n in range(0, Nt):
        b[1:-1] = u_1[1:-1] + \
                  Fr*(u_1[:-2] - 2*u_1[1:-1] + u_1[2:]) + \
                  dt*theta*f(u_1[1:-1], t[step_no+n+1]) + \
                  dt*(1-theta)*f(u_1[1:-1], t[step_no+n])
        b[0] = u_L; b[-1] = u_R  # boundary conditions
        u[:] = scipy.sparse.linalg.spsolve(A, b)

        if user_action is not None:
            user_action(u, x, t, step_no+(n+1))

        # Update u_1 before next step
        u_1, u = u, u_1

    # u is now contained in u_1 (swapping)
    return u_1
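To see the tridiagonal system of the theta-rule in isolation, one can take a single time step on a sine mode and compare with the known amplification factor $(1 - 4F(1-\theta)\sin^2 p)/(1 + 4F\theta\sin^2 p)$, $p = k\Delta x/2$. The sketch below is a stripped-down illustration under assumptions: the function name theta_step and the parameter values are ours, there is no source term, and both boundary values are zero.

```python
import numpy as np
import scipy.sparse
import scipy.sparse.linalg

def theta_step(u_1, F, theta):
    """One theta-rule step for u_t = a*u_xx with u = 0 at both ends."""
    Nx = len(u_1) - 1
    diagonal = np.full(Nx + 1, 1 + 2*F*theta)
    lower = np.full(Nx, -F*theta)
    upper = np.full(Nx, -F*theta)
    # Dirichlet rows: the first and last equations read u[0] = 0, u[Nx] = 0
    diagonal[0] = diagonal[Nx] = 1
    upper[0] = 0
    lower[-1] = 0
    A = scipy.sparse.diags([diagonal, lower, upper], offsets=[0, -1, 1],
                           shape=(Nx + 1, Nx + 1), format='csr')
    b = np.zeros(Nx + 1)
    b[1:-1] = u_1[1:-1] + \
              F*(1 - theta)*(u_1[:-2] - 2*u_1[1:-1] + u_1[2:])
    return scipy.sparse.linalg.spsolve(A, b)

Nx, L, F, theta = 20, 1.5, 0.5, 0.5   # illustrative values
x = np.linspace(0, L, Nx + 1)
k = np.pi/L
u0 = np.sin(k*x)                      # discrete sine mode
u1 = theta_step(u0, F, theta)

p = k*(x[1] - x[0])/2
A_exact = (1 - 4*F*(1 - theta)*np.sin(p)**2)/(1 + 4*F*theta*np.sin(p)**2)
max_deviation = np.abs(u1 - A_exact*u0).max()
```

Since the sine mode is an eigenvector of the discrete Laplace operator with these boundary conditions, max_deviation should be at the level of rounding errors.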


For the no splitting approach with Forward Euler in time, this solver handles both 
the diffusion and the reaction term. When splitting, diffusion_theta takes care 
of the diffusion term only, while the reaction term is handled either by a Forward 
Euler scheme in reaction_FE, or by a second order Adams-Bashforth scheme 
from Odespy. The reaction_FE function covers one complete time step dt during 
ordinary splitting, while Strang splitting (both first and second order) applies it with 
dt/2 twice during each time step dt. Since the reaction term typically represents a 
much faster process than the diffusion term, a further refinement of the time step is 
made possible in reaction_FE. It was implemented as


def reaction_FE(I, f, L, Nx, dt, dt_Rfactor, t, step_no,
                user_action=None):
    """Reaction solver, Forward Euler method.
    Note that t covers the whole global time interval.
    dt is either one complete, or one half, of the step in the
    diffusion part, i.e. there is a local time interval
    [0, dt] or [0, dt/2] that the reaction_FE
    deals with each time it is called. step_no keeps
    track of the (global) time step number (required
    for lookup in t).
    """

    u = np.copy(I)
    dt_local = dt/float(dt_Rfactor)
    Nt_local = int(round(dt/float(dt_local)))
    x = np.linspace(0, L, Nx+1)

    for n in range(Nt_local):
        time = t[step_no] + n*dt_local
        u[1:Nx] = u[1:Nx] + dt_local*f(u[1:Nx], time)

    # BC already inserted in diffusion step, i.e. no action here
    return u
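The gain from taking several local steps in the reaction part can be illustrated on its own. The sketch below is a simplified stand-in for reaction_FE, not the function itself: the name forward_euler_substeps, the decay rate b = 1 and the step dt = 0.1 are assumptions made for the illustration.

```python
import numpy as np

def forward_euler_substeps(u0, f, t0, dt, n_sub):
    """Advance u' = f(u, t) from t0 to t0 + dt using n_sub FE steps."""
    u = np.array(u0, dtype=float)
    dt_local = dt/float(n_sub)
    for n in range(n_sub):
        time = t0 + n*dt_local
        u = u + dt_local*f(u, time)
    return u

b = 1.0
f = lambda u, t: -b*u                 # linear decay as reaction term
u0 = np.array([1.0, 0.5])
dt = 0.1
exact = u0*np.exp(-b*dt)

err_1 = np.abs(forward_euler_substeps(u0, f, 0.0, dt, 1) - exact).max()
err_10 = np.abs(forward_euler_substeps(u0, f, 0.0, dt, 10) - exact).max()
```

Ten local steps reduce the error over the interval by roughly a factor of ten in this example, as one expects from a first-order method.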


With the ordinary splitting approach, each time step dt is covered twice: first we compute the impact of the reaction term, then the contribution from the diffusion term:


def ordinary_splitting(I, a, b, f, L, dt,
                       dt_Rfactor, F, t, T,
                       user_action=None):
    '''1st order scheme, i.e. Forward Euler is enough for both
    the diffusion and the reaction part. The time step dt is
    given for the diffusion step, while the time step for the
    reaction part is found as dt/dt_Rfactor, where dt_Rfactor >= 1.
    '''
    Nt = int(round(T/float(dt)))
    dx = np.sqrt(a*dt/F)
    Nx = int(round(L/dx))
    x = np.linspace(0, L, Nx+1)    # Mesh points in space
    u = np.zeros(Nx+1)

    # Set initial condition u(x,0) = I(x)
    for i in range(0, Nx+1):
        u[i] = I(x[i])

    # In the following loop, each time step is "covered twice",
    # first for reaction, then for diffusion
    for n in range(0, Nt):
        # Reaction step (potentially many smaller steps within dt)
        u_s = reaction_FE(I=u, f=f, L=L, Nx=Nx,
                          dt=dt, dt_Rfactor=dt_Rfactor,
                          t=t, step_no=n,
                          user_action=None)

        u = diffusion_theta(I=u_s, a=a, f=0, L=L, dt=dt, F=F,
                            t=t, T=dt, step_no=n, theta=0,
                            u_L=0, u_R=0, user_action=None)

        if user_action is not None:
            user_action(u, x, t, n+1)

    return


For the two Strang splitting approaches, each time step dt is handled by first 
computing the reaction step for (the first) dt/2, followed by a diffusion step dt, 
before the reaction step is treated once again for (the remaining) dt/2. Since first 
order Strang splitting is no better than first order accurate, both the reaction and 
diffusion steps are computed explicitly. The solver was implemented as 


def Strang_splitting_1stOrder(I, a, b, f, L, dt, dt_Rfactor,
                              F, t, T, user_action=None):
    '''Strang splitting while still using FE for the reaction
    step and for the diffusion step. Gives 1st order scheme.
    The time step dt is given for the diffusion step, while
    the time step for the reaction part is found as
    0.5*dt/dt_Rfactor, where dt_Rfactor >= 1. Introduce an
    extra time mesh t2 for the reaction part, since it steps dt/2.
    '''
    Nt = int(round(T/float(dt)))
    t2 = np.linspace(0, Nt*dt, (Nt+1)+Nt)   # Mesh points in diff
    dx = np.sqrt(a*dt/F)
    Nx = int(round(L/dx))
    x = np.linspace(0, L, Nx+1)
    u = np.zeros(Nx+1)

    # Set initial condition u(x,0) = I(x)
    for i in range(0, Nx+1):
        u[i] = I(x[i])

    for n in range(0, Nt):
        # Reaction step (1/2 dt: from t_n to t_n+1/2)
        # (potentially many smaller steps within dt/2)
        u_s = reaction_FE(I=u, f=f, L=L, Nx=Nx,
                          dt=dt/2.0, dt_Rfactor=dt_Rfactor,
                          t=t2, step_no=2*n,
                          user_action=None)
        # Diffusion step (1 dt: from t_n to t_n+1)
        u_sss = diffusion_theta(I=u_s, a=a, f=0, L=L, dt=dt, F=F,
                                t=t, T=dt, step_no=n, theta=0,
                                u_L=0, u_R=0, user_action=None)
        # Reaction step (1/2 dt: from t_n+1/2 to t_n+1)
        # (potentially many smaller steps within dt/2)
        u = reaction_FE(I=u_sss, f=f, L=L, Nx=Nx,
                        dt=dt/2.0, dt_Rfactor=dt_Rfactor,
                        t=t2, step_no=2*n+1,
                        user_action=None)

        if user_action is not None:
            user_action(u, x, t, n+1)

    return


The second order version of the Strang splitting approach utilizes a second order
Adams-Bashforth solver for the reaction part and a Crank-Nicolson scheme for the
diffusion part. The solver has the same structure as the one for first order Strang
splitting and was implemented as


def Strang_splitting_2ndOrder(I, a, b, f, L, dt, dt_Rfactor,
                              F, t, T, user_action=None):
    '''Strang splitting using Crank-Nicolson for the diffusion
    step (theta-rule) and Adams-Bashforth 2 for the reaction step.
    Gives 2nd order scheme. Introduce an extra time mesh t2 for
    the reaction part, since it steps dt/2.
    '''
    import odespy
    Nt = int(round(T/float(dt)))
    t2 = np.linspace(0, Nt*dt, (Nt+1)+Nt)   # Mesh points in diff
    dx = np.sqrt(a*dt/F)
    Nx = int(round(L/dx))
    x = np.linspace(0, L, Nx+1)
    u = np.zeros(Nx+1)

    # Set initial condition u(x,0) = I(x)
    for i in range(0, Nx+1):
        u[i] = I(x[i])

    reaction_solver = odespy.AdamsBashforth2(f)

    for n in range(0, Nt):
        # Reaction step (1/2 dt: from t_n to t_n+1/2)
        # (potentially many smaller steps within dt/2)
        reaction_solver.set_initial_condition(u)
        t_points = np.linspace(0, dt/2.0, dt_Rfactor+1)
        u_AB2, t_ = reaction_solver.solve(t_points)  # t_ not needed
        u_s = u_AB2[-1,:]   # pick sol at last point in time

        # Diffusion step (1 dt: from t_n to t_n+1)
        u_sss = diffusion_theta(I=u_s, a=a, f=0, L=L, dt=dt, F=F,
                                t=t, T=dt, step_no=n, theta=0.5,
                                u_L=0, u_R=0, user_action=None)

        # Reaction step (1/2 dt: from t_n+1/2 to t_n+1)
        # (potentially many smaller steps within dt/2)
        reaction_solver.set_initial_condition(u_sss)
        t_points = np.linspace(0, dt/2.0, dt_Rfactor+1)
        u_AB2, t_ = reaction_solver.solve(t_points)  # t_ not needed
        u = u_AB2[-1,:]     # pick sol at last point in time

        if user_action is not None:
            user_action(u, x, t, n+1)

    return


When executing split_diffu_react.py, we find that the estimated convergence rates are as expected. The second order Strang splitting gives the least error (about 4e-5) and has second order convergence ($r = 2$), while the remaining three approaches have first order convergence ($r = 1$).
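The rate formula used in convergence_rates is easy to test in isolation on data where the answer is known. The sketch below uses a hypothetical helper, convergence_rates_from, and synthetic errors obeying $E = Ch^r$ exactly with $C = 3$ and $r = 2$; these numbers are made up and are not output from split_diffu_react.py.

```python
import numpy as np

def convergence_rates_from(E, h):
    """Return r_i = ln(E_i/E_{i-1})/ln(h_i/h_{i-1}) for i = 1, 2, ..."""
    return [np.log(E[i]/E[i-1])/np.log(h[i]/h[i-1])
            for i in range(1, len(E))]

# Synthetic data obeying E = C*h**r exactly (C = 3, r = 2)
h = [0.15/2**i for i in range(4)]
E = [3*h_i**2 for h_i in h]
rates = convergence_rates_from(E, h)
```

For real simulation data the computed rates only approach the theoretical $r$ as the mesh is refined, so it is usually the last entries of the list that should be compared with the expected value.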
5.6.6 Analysis of the Splitting Method


Let us address a linear PDE problem for which we can develop analytical solutions of the discrete equations, with and without splitting, and discuss these. Choosing $f(u) = -\beta u$ for a constant $\beta$ gives a linear problem. We use the Forward Euler method for both the PDE and ODE problems.

We seek a 1D Fourier wave component solution of the problem, assuming homogeneous Dirichlet conditions at $x = 0$ and $x = L$:

$$u = e^{-\alpha k^2 t - \beta t}\sin kx,\quad k = \frac{\pi}{L}.$$

This component fits the 1D PDE problem ($f = 0$). On complex form we can write

$$u = e^{-\alpha k^2 t - \beta t + ikx},$$

where $i = \sqrt{-1}$ and the imaginary part is taken as the physical solution.

We refer to Sect. 3.3 and to the book [9] for a discussion of exact numerical solutions to diffusion and decay problems, respectively. The key idea is to search for solutions $A^n e^{ikx}$ and determine $A$. For the diffusion problem solved by a Forward Euler method one has

$$A = 1 - 4F\sin^2 p,$$

where $F = \alpha\Delta t/\Delta x^2$ is the mesh Fourier number and $p = k\Delta x/2$ is a dimensionless number reflecting the spatial resolution (number of points per wave length in space). For the decay problem $u' = -\beta u$, we have $A = 1 - q$, where $q$ is a dimensionless parameter for the resolution in the decay problem: $q = \beta\Delta t$.

The original model problem can also be discretized by a Forward Euler scheme,

$$[D_t^+ u = \alpha D_xD_x u - \beta u]_i^n.$$

Assuming $A^n e^{ikx}$ we find that

$$u_i^n = (1 - 4F\sin^2 p - q)^n \sin kx_i.$$

We are particularly interested in what happens at one time step. That is,

$$u_i^{n+1} = (1 - 4F\sin^2 p - q)u_i^n.$$

In the two stage algorithm, we first compute the diffusion step

$$u_i^{*,n+1} = (1 - 4F\sin^2 p)u_i^n.$$

Then we use this as input to the decay algorithm and arrive at

$$u_i^{n+1} = (1 - q)u_i^{*,n+1} = (1 - q)(1 - 4F\sin^2 p)u_i^n.$$

The splitting approximation over one step is therefore

$$E = 1 - 4F\sin^2 p - q - (1 - q)(1 - 4F\sin^2 p) = -4qF\sin^2 p.$$
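The algebra behind the one-step splitting error is short, but it is easily checked by sympy. In the sketch below the symbol s is our shorthand for $\sin^2 p$, and the names A_unsplit and A_split are chosen for the illustration.

```python
import sympy as sym

F, q, s = sym.symbols('F q s')   # s stands for sin(p)**2

A_unsplit = 1 - 4*F*s - q        # Forward Euler on the full equation
A_split = (1 - q)*(1 - 4*F*s)    # diffusion step followed by decay step

# Difference between the two one-step amplification factors
E = sym.expand(A_unsplit - A_split)
```

Expanding the product gives $E = -4qF\sin^2 p$. Since both $q$ and $F$ are proportional to $\Delta t$, the difference per step is of order $\Delta t^2$, which is what one expects from a splitting that is first-order accurate over a fixed time interval.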
5.7 Exercises

Problem 5.1: Determine if equations are nonlinear or not
Classify each term in the following equations as linear or nonlinear. Assume that $u$, $\boldsymbol{u}$, and $p$ are unknown functions and that all other symbols are known quantities.

1. $mu'' + \beta|u'|u' + cu = F(t)$
2. $u_t = \alpha u_{xx}$
3. $u_{tt} = c^2\nabla^2 u$
4. $u_t = \nabla\cdot(\alpha(u)\nabla u) + f(x,y)$
5. $u_t + f(u)_x = 0$
6. $\boldsymbol{u}_t + \boldsymbol{u}\cdot\nabla\boldsymbol{u} = -\nabla p + r\nabla^2\boldsymbol{u}$, $\nabla\cdot\boldsymbol{u} = 0$ ($\boldsymbol{u}$ is a vector field)
7. $u' = f(u,t)$
8. $\nabla^2 u = \lambda e^u$


Filename: nonlinear_vs_linear. 


Problem 5.2: Derive and investigate a generalized logistic model 
The logistic model for population growth is derived by assuming a nonlinear growth rate,

$$u' = a(u)u,\quad u(0) = I, \tag{5.82}$$

and the logistic model arises from the simplest possible choice of $a(u)$: $a(u) = \varrho(1 - u/M)$, where $M$ is the maximum value of $u$ that the environment can sustain, and $\varrho$ is the growth under unlimited access to resources (as in the beginning when $u$ is small). The idea is that $a(u) \sim \varrho$ when $u$ is small and that $a(u) \rightarrow 0$ as $u \rightarrow M$.

An $a(u)$ that generalizes the linear choice is the polynomial form

$$a(u) = \varrho(1 - u/M)^p, \tag{5.83}$$

where $p > 0$ is some real number.

a) Formulate a Forward Euler, Backward Euler, and a Crank-Nicolson scheme for (5.82).

Hint Use a geometric mean approximation in the Crank-Nicolson scheme: $[a(u)u]^{n+1/2} \approx a(u^n)u^{n+1}$.

b) Formulate Picard and Newton iteration for the Backward Euler scheme in a).

c) Implement the numerical solution methods from a) and b). Use logistic.py to compare the case $p = 1$ and the choice (5.83).

d) Implement unit tests that check the asymptotic limit of the solutions: $u \rightarrow M$ as $t \rightarrow \infty$.


Hint You need to experiment to find what “infinite time” is (increases substantially 
with p) and what the appropriate tolerance is for testing the asymptotic limit. 


e) Perform experiments with Newton and Picard iteration for the model (5.83). See 
how sensitive the number of iterations is to At and p. 


Filename: logistic_p.


Problem 5.3: Experience the behavior of Newton’s method
The program Newton_demo.py illustrates graphically each step in Newton’s method and is run like

Terminal> python Newton_demo.py f dfdx x0 xmin xmax

Use this program to investigate potential problems with Newton’s method when solving $e^{-0.5x^2}\cos(\pi x) = 0$. Try a starting point $x_0 = 0.8$ and $x_0 = 0.85$ and watch the different behavior. Just run

Terminal> python Newton_demo.py '0.2 + exp(-0.5*x**2)*cos(pi*x)' \
          '-x*exp(-x**2)*cos(pi*x) - pi*exp(-x**2)*sin(pi*x)' \
          0.85 -3 3

and repeat with 0.85 replaced by 0.8.


Exercise 5.4: Compute the Jacobian of a 2 x 2 system 
Write up the system (5.18)-(5.19) in the form $F(u) = 0$, $F = (F_0, F_1)$, $u = (u_0, u_1)$, and compute the Jacobian $J_{i,j} = \partial F_i/\partial u_j$.


Problem 5.5: Solve nonlinear equations arising from a vibration ODE 
Consider a nonlinear vibration problem

$$mu'' + bu'|u'| + s(u) = F(t), \tag{5.84}$$

where $m > 0$ is a constant, $b > 0$ is a constant, $s(u)$ a possibly nonlinear function of $u$, and $F(t)$ is a prescribed function. Such models arise from Newton’s second law of motion in mechanical vibration problems where $s(u)$ is a spring or restoring force, $mu''$ is mass times acceleration, and $bu'|u'|$ models water or air drag.

a) Rewrite the equation for $u$ as a system of two first-order ODEs, and discretize this system by a Crank-Nicolson (centered difference) method. With $v = u'$, we get a nonlinear term $v^{n+\frac{1}{2}}|v^{n+\frac{1}{2}}|$. Use a geometric average for $v^{n+\frac{1}{2}}$.

b) Formulate a Picard iteration method to solve the system of nonlinear algebraic equations.

c) Explain how to apply Newton’s method to solve the nonlinear equations at each time level. Derive expressions for the Jacobian and the right-hand side in each Newton iteration.


Filename: nonlin_vib. 


Exercise 5.6: Find the truncation error of arithmetic mean of products 
In Sect. 5.3.4 we introduce alternative arithmetic means of a product. Say the product is $P(t)Q(t)$ evaluated at $t = t_{n+\frac{1}{2}}$. The exact value is

$$[PQ]^{n+\frac{1}{2}} = P^{n+\frac{1}{2}}Q^{n+\frac{1}{2}}.$$

There are two obvious candidates for evaluating $[PQ]^{n+\frac{1}{2}}$ as a mean of values of $P$ and $Q$ at $t_n$ and $t_{n+1}$. Either we can take the arithmetic mean of each factor $P$ and $Q$,

$$[PQ]^{n+\frac{1}{2}} \approx \frac{1}{2}(P^n + P^{n+1})\frac{1}{2}(Q^n + Q^{n+1}), \tag{5.85}$$

or we can take the arithmetic mean of the product $PQ$:

$$[PQ]^{n+\frac{1}{2}} \approx \frac{1}{2}(P^nQ^n + P^{n+1}Q^{n+1}). \tag{5.86}$$

The arithmetic average approximates $P(t_{n+\frac{1}{2}})$ with an error of $O(\Delta t^2)$:

$$P(t_{n+\frac{1}{2}}) = \frac{1}{2}(P^n + P^{n+1}) + O(\Delta t^2).$$

A fundamental question is whether (5.85) and (5.86) have different orders of accuracy in $\Delta t = t_{n+1} - t_n$. To investigate this question, expand quantities at $t_{n+1}$ and $t_n$ in Taylor series around $t_{n+\frac{1}{2}}$, and subtract the true value $[PQ]^{n+\frac{1}{2}}$ from the approximations (5.85) and (5.86) to see what the order of the error terms are.


Hint You may explore sympy for carrying out the tedious calculations. A general Taylor series expansion of $P(t + \frac{1}{2}\Delta t)$ around $t$ involving just a general function $P(t)$ can be created as follows:


>>> from sympy import *
>>> t, dt = symbols('t dt')
>>> P = symbols('P', cls=Function)
>>> P(t).series(t, 0, 4)
P(0) + t*Subs(Derivative(P(_x), _x), (_x,), (0,)) +
t**2*Subs(Derivative(P(_x), _x, _x), (_x,), (0,))/2 +
t**3*Subs(Derivative(P(_x), _x, _x, _x), (_x,), (0,))/6 + O(t**4)
>>> P_p = P(t).series(t, 0, 4).subs(t, dt/2)
>>> P_p
P(0) + dt*Subs(Derivative(P(_x), _x), (_x,), (0,))/2 +
dt**2*Subs(Derivative(P(_x), _x, _x), (_x,), (0,))/8 +
dt**3*Subs(Derivative(P(_x), _x, _x, _x), (_x,), (0,))/48 + O(dt**4)


The error of the arithmetic mean, $\frac{1}{2}(P(-\frac{1}{2}\Delta t) + P(\frac{1}{2}\Delta t))$ for $t = 0$ is then


>>> P_m = P(t).series(t, 0, 4).subs(t, -dt/2)
>>> mean = Rational(1,2)*(P_m + P_p)
>>> error = simplify(expand(mean) - P(0))
>>> error
dt**2*Subs(Derivative(P(_x), _x, _x), (_x,), (0,))/8 + O(dt**4)


Use these examples to investigate the error of (5.85) and (5.86) for $n = 0$. (Choosing $n = 0$ is necessary for not making the expressions too complicated for sympy, but there is of course no lack of generality by using $n = 0$ rather than an arbitrary $n$ - the main point is the product and addition of Taylor series.)

Filename: product_arith_mean. 




Problem 5.7: Newton’s method for linear problems 

Suppose we have a linear system F(u) = Au — b = 0. Apply Newton’s method to 
this system, and show that the method converges in one iteration. 

Filename: Newton_linear. 


Problem 5.8: Discretize a 1D problem with a nonlinear coefficient 
We consider the problem

$$((1 + u^2)u')' = 1,\quad x \in (0,1),\quad u(0) = u(1) = 0. \tag{5.87}$$


Discretize (5.87) by a centered finite difference method on a uniform mesh. 
Filename: nonlin_1D_coeff_discretize. 


Problem 5.9: Linearize a 1D problem with a nonlinear coefficient 
We have a two-point boundary value problem

$$((1 + u^2)u')' = 1,\quad x \in (0,1),\quad u(0) = u(1) = 0. \tag{5.88}$$

a) Construct a Picard iteration method for (5.88) without discretizing in space.

b) Apply Newton’s method to (5.88) without discretizing in space.

c) Discretize (5.88) by a centered finite difference scheme. Construct a Picard method for the resulting system of nonlinear algebraic equations.

d) Discretize (5.88) by a centered finite difference scheme. Define the system of nonlinear algebraic equations, calculate the Jacobian, and set up Newton’s method for solving the system.

Filename: nonlin_1D_coeff_linearize. 


Problem 5.10: Finite differences for the 1D Bratu problem 
We address the so-called Bratu problem

$$u'' + \lambda e^u = 0,\quad x \in (0,1),\quad u(0) = u(1) = 0, \tag{5.89}$$

where $\lambda$ is a given parameter and $u$ is a function of $x$. This is a widely used model problem for studying numerical methods for nonlinear differential equations. The problem (5.89) has an exact solution

$$u_e(x) = -2\ln\left(\frac{\cosh((x - \frac{1}{2})\theta/2)}{\cosh(\theta/4)}\right),$$

where $\theta$ solves

$$\theta = \sqrt{2\lambda}\cosh(\theta/4).$$

There are two solutions of (5.89) for $0 < \lambda < \lambda_c$ and no solution for $\lambda > \lambda_c$. For $\lambda = \lambda_c$ there is one unique solution. The critical value $\lambda_c$ solves

$$1 = \sqrt{2\lambda_c}\,\frac{1}{4}\sinh(\theta(\lambda_c)/4).$$

A numerical value is $\lambda_c = 3.513830719$.

a) Discretize (5.89) by a centered finite difference method.

b) Set up the nonlinear equations $F_i(u_0, u_1, \ldots, u_{N_x}) = 0$ from a). Calculate the associated Jacobian.

c) Implement a solver that can compute $u(x)$ using Newton’s method. Plot the error as a function of $x$ in each iteration.

d) Investigate whether Newton’s method gives second-order convergence by computing $||u_e - u||/||u_e - u^-||^2$ in each iteration, where $u$ is the solution in the current iteration and $u^-$ is the solution in the previous iteration.


Filename: nonlin_1D_Bratu_fd. 


Problem 5.11: Discretize a nonlinear 1D heat conduction PDE by finite 
differences 
We address the 1D heat conduction PDE

$$\varrho c(T)T_t = (k(T)T_x)_x,$$

for $x \in [0, L]$, where $\varrho$ is the density of the solid material, $c(T)$ is the heat capacity, $T$ is the temperature, and $k(T)$ is the heat conduction coefficient. $T(x,0) = I(x)$, and ends are subject to a cooling law:

$$k(T)T_x|_{x=0} = h(T)(T - T_s),\quad -k(T)T_x|_{x=L} = h(T)(T - T_s),$$

where $h(T)$ is a heat transfer coefficient and $T_s$ is the given surrounding temperature.

a) Discretize this PDE in time using either a Backward Euler or Crank-Nicolson scheme.

b) Formulate a Picard iteration method for the time-discrete problem (i.e., an iteration method before discretizing in space).

c) Formulate a Newton method for the time-discrete problem in b).

d) Discretize the PDE by a finite difference method in space. Derive the matrix and right-hand side of a Picard iteration method applied to the space-time discretized PDE.

e) Derive the matrix and right-hand side of a Newton method applied to the discretized PDE in d).


Filename: nonlin_1D_heat_FD. 


Problem 5.12: Differentiate a highly nonlinear term 

The operator $\nabla\cdot(\alpha(u)\nabla u)$ with $\alpha(u) = |\nabla u|^q$ appears in several physical problems, especially flow of Non-Newtonian fluids. The expression $|\nabla u|$ is defined as the Euclidean norm of a vector: $|\nabla u|^2 = \nabla u\cdot\nabla u$. In a Newton method one has to carry out the differentiation $\partial\alpha(u)/\partial c_j$, for $u = \sum_k c_k\psi_k$. Show that

$$\frac{\partial}{\partial c_j}|\nabla u|^q = q|\nabla u|^{q-2}\nabla u\cdot\nabla\psi_j.$$

Filename: nonlin_differentiate.


Exercise 5.13: Crank-Nicolson for a nonlinear 3D diffusion equation
Redo Sect. 5.5.1 when a Crank-Nicolson scheme is used to discretize the equations in time and the problem is formulated for three spatial dimensions.

Hint Express the Jacobian as $J_{i,j,k,r,s,t} = \partial F_{i,j,k}/\partial u_{r,s,t}$ and observe, as in the 2D case, that $J_{i,j,k,r,s,t}$ is very sparse: $J_{i,j,k,r,s,t} \neq 0$ only for $r = i \pm 1$, $s = j \pm 1$, and $t = k \pm 1$ as well as $r = i$, $s = j$, and $t = k$.

Filename: nonlin_heat_FD_CN_2D. 


Problem 5.14: Find the sparsity of the Jacobian 

Consider a typical nonlinear Laplace term like $\nabla\cdot\alpha(u)\nabla u$ discretized by centered finite differences. Explain why the Jacobian corresponding to this term has the same sparsity pattern as the matrix associated with the corresponding linear term $\alpha\nabla^2 u$.


Hint Set up the unknowns that enter the difference equation at a point $(i, j)$ in 2D or $(i, j, k)$ in 3D, and identify the nonzero entries of the Jacobian that can arise from such a type of difference equation.

Filename: nonlin_sparsity_Jacobian. 


Problem 5.15: Investigate a 1D problem with a continuation method 
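Flow of a pseudo-plastic power-law fluid between two flat plates can be modeled by

$$\frac{d}{dx}\left(\mu_0\left|\frac{du}{dx}\right|^{n-1}\frac{du}{dx}\right) = -\beta,\quad u'(0) = 0,\quad u(H) = 0,$$

where $\beta > 0$ and $\mu_0 > 0$ are constants. A target value of $n$ may be $n = 0.2$.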


a) Formulate a Picard iteration method directly for the differential equation prob- 
lem. 

b) Perform a finite difference discretization of the problem in each Picard iteration. 
Implement a solver that can compute u on a mesh. Verify that the solver gives 
an exact solution for n = 1 on a uniform mesh regardless of the cell size. 

c) Given a sequence of decreasing $n$ values, solve the problem for each $n$ using the solution for the previous $n$ as initial guess for the Picard iteration. This is called a continuation method. Experiment with $n = (1, 0.6, 0.2)$ and $n = (1, 0.9, 0.8, \ldots, 0.2)$ and make a table of the number of Picard iterations versus $n$.

d) Derive a Newton method at the differential equation level and discretize the 
resulting linear equations in each Newton iteration with the finite difference 
method. 

e) Investigate if Newton’s method has better convergence properties than Picard iteration, both in combination with a continuation method.


Useful Formulas


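A.1 Finite Difference Operator Notation


$$u'(t_n) \approx [D_t u]^n = \frac{u^{n+\frac{1}{2}} - u^{n-\frac{1}{2}}}{\Delta t} \tag{A.1}$$

$$u'(t_n) \approx [D_{2t} u]^n = \frac{u^{n+1} - u^{n-1}}{2\Delta t} \tag{A.2}$$

$$u'(t_n) \approx [D_t^- u]^n = \frac{u^n - u^{n-1}}{\Delta t} \tag{A.3}$$

$$u'(t_n) \approx [D_t^+ u]^n = \frac{u^{n+1} - u^n}{\Delta t} \tag{A.4}$$

$$u'(t_{n+\theta}) \approx [\bar D_t u]^{n+\theta} = \frac{u^{n+1} - u^n}{\Delta t} \tag{A.5}$$

$$u'(t_n) \approx [D_t^{2-} u]^n = \frac{3u^n - 4u^{n-1} + u^{n-2}}{2\Delta t} \tag{A.6}$$

$$u''(t_n) \approx [D_tD_t u]^n = \frac{u^{n+1} - 2u^n + u^{n-1}}{\Delta t^2} \tag{A.7}$$

$$u(t_{n+\frac{1}{2}}) \approx [\overline{u}^{t}]^{n+\frac{1}{2}} = \frac{1}{2}(u^{n+1} + u^n) \tag{A.8}$$

$$u(t_{n+\frac{1}{2}})^2 \approx [\overline{u^2}^{t,g}]^{n+\frac{1}{2}} = u^{n+1}u^n \tag{A.9}$$

$$u(t_{n+\frac{1}{2}}) \approx [\overline{u}^{t,h}]^{n+\frac{1}{2}} = \frac{2}{\frac{1}{u^{n+1}} + \frac{1}{u^n}} \tag{A.10}$$

$$u(t_{n+\theta}) \approx [\overline{u}^{t,\theta}]^{n+\theta} = \theta u^{n+1} + (1 - \theta)u^n, \tag{A.11}$$

$$t_{n+\theta} = \theta t_{n+1} + (1 - \theta)t_n \tag{A.12}$$


Some may wonder why $\theta$ is absent on the right-hand side of (A.5). The fraction is an approximation to the derivative at the point $t_{n+\theta} = \theta t_{n+1} + (1 - \theta)t_n$.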
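The averaging operators (A.8)-(A.11) are simple to try out numerically. The sketch below is only an illustration under assumptions: the test function $u = e^t$ and the values of t_n, dt and theta are arbitrary choices of ours.

```python
import numpy as np

u = np.exp                     # any smooth, positive test function will do
t_n, dt, theta = 0.5, 0.01, 0.5
u_n, u_np1 = u(t_n), u(t_n + dt)

arithmetic = 0.5*(u_n + u_np1)                  # (A.8)
geometric_sq = u_n*u_np1                        # (A.9), approximates u**2
harmonic = 2.0/(1.0/u_np1 + 1.0/u_n)            # (A.10)
weighted = theta*u_np1 + (1 - theta)*u_n        # (A.11)
t_theta = theta*(t_n + dt) + (1 - theta)*t_n    # (A.12)

u_mid = u(t_n + 0.5*dt)        # exact value at the midpoint
```

For $\theta = \frac{1}{2}$ the weighted mean coincides with the arithmetic mean. With the exponential test function the geometric mean happens to reproduce $u(t_{n+\frac{1}{2}})^2$ exactly, while the arithmetic and harmonic means deviate from $u(t_{n+\frac{1}{2}})$ by terms of order $\Delta t^2$.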
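A.2 Truncation Errors of Finite Difference Approximations


In the formulas below, $R$ denotes the exact derivative (or function value) minus its finite difference approximation.

$$u_e'(t_n) = [D_t u_e]^n + R^n = \frac{u_e^{n+\frac{1}{2}} - u_e^{n-\frac{1}{2}}}{\Delta t} + R^n,$$
$$R^n = -\frac{1}{24}u_e'''(t_n)\Delta t^2 + O(\Delta t^4) \tag{A.13}$$

$$u_e'(t_n) = [D_{2t} u_e]^n + R^n = \frac{u_e^{n+1} - u_e^{n-1}}{2\Delta t} + R^n,$$
$$R^n = -\frac{1}{6}u_e'''(t_n)\Delta t^2 + O(\Delta t^4) \tag{A.14}$$

$$u_e'(t_n) = [D_t^- u_e]^n + R^n = \frac{u_e^n - u_e^{n-1}}{\Delta t} + R^n,$$
$$R^n = \frac{1}{2}u_e''(t_n)\Delta t + O(\Delta t^2) \tag{A.15}$$

$$u_e'(t_n) = [D_t^+ u_e]^n + R^n = \frac{u_e^{n+1} - u_e^n}{\Delta t} + R^n,$$
$$R^n = -\frac{1}{2}u_e''(t_n)\Delta t + O(\Delta t^2) \tag{A.16}$$

$$u_e'(t_{n+\theta}) = [\bar D_t u_e]^{n+\theta} + R^{n+\theta} = \frac{u_e^{n+1} - u_e^n}{\Delta t} + R^{n+\theta},$$
$$R^{n+\theta} = -\frac{1}{2}(1 - 2\theta)u_e''(t_{n+\theta})\Delta t - \frac{1}{6}((1 - \theta)^3 + \theta^3)u_e'''(t_{n+\theta})\Delta t^2 + O(\Delta t^3) \tag{A.17}$$

$$u_e'(t_n) = [D_t^{2-} u_e]^n + R^n = \frac{3u_e^n - 4u_e^{n-1} + u_e^{n-2}}{2\Delta t} + R^n,$$
$$R^n = \frac{1}{3}u_e'''(t_n)\Delta t^2 + O(\Delta t^3) \tag{A.18}$$

$$u_e''(t_n) = [D_tD_t u_e]^n + R^n = \frac{u_e^{n+1} - 2u_e^n + u_e^{n-1}}{\Delta t^2} + R^n,$$
$$R^n = -\frac{1}{12}u_e''''(t_n)\Delta t^2 + O(\Delta t^4) \tag{A.19}$$

$$u_e(t_{n+\theta}) = [\overline{u_e}^{t,\theta}]^{n+\theta} + R^{n+\theta} = \theta u_e^{n+1} + (1 - \theta)u_e^n + R^{n+\theta},$$
$$R^{n+\theta} = -\frac{1}{2}u_e''(t_{n+\theta})\Delta t^2\theta(1 - \theta) + O(\Delta t^3). \tag{A.20}$$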
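The leading terms of such truncation errors, including their signs, can be checked numerically. The sketch below defines $R$ as the exact derivative minus the difference quotient and compares it with the leading term from a Taylor expansion around $t_n$. The test function $u = \sin t$, the point t = 1 and the step dt = 0.01 are arbitrary choices made for the illustration.

```python
import numpy as np

u = np.sin
du = np.cos
d2u = lambda t: -np.sin(t)
d3u = lambda t: -np.cos(t)
d4u = np.sin
t, dt = 1.0, 0.01

# name: (exact derivative minus difference quotient, leading term of R)
cases = {
    'forward': (du(t) - (u(t + dt) - u(t))/dt,
                -0.5*d2u(t)*dt),
    'backward': (du(t) - (u(t) - u(t - dt))/dt,
                 0.5*d2u(t)*dt),
    'centered': (du(t) - (u(t + dt/2) - u(t - dt/2))/dt,
                 -d3u(t)*dt**2/24),
    'centered_2dt': (du(t) - (u(t + dt) - u(t - dt))/(2*dt),
                     -d3u(t)*dt**2/6),
    'second_derivative': (
        d2u(t) - (u(t + dt) - 2*u(t) + u(t - dt))/dt**2,
        -d4u(t)*dt**2/12),
}

for name in sorted(cases):
    R, lead = cases[name]
    print('%-18s R = %12.4e   leading term = %12.4e' % (name, R, lead))
```

For each operator the computed $R$ and the leading term agree in sign and to within a few percent for this step size. The remaining gap is the next term in the expansion.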
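A.3 Finite Differences of Exponential Functions


Complex exponentials Let $u^n = \exp(i\omega n\Delta t) = e^{i\omega t_n}$.

$$[D_tD_t u]^n = u^n\frac{2}{\Delta t^2}(\cos\omega\Delta t - 1) = -u^n\frac{4}{\Delta t^2}\sin^2\left(\frac{\omega\Delta t}{2}\right), \tag{A.21}$$

$$[D_t^+ u]^n = u^n\frac{1}{\Delta t}(\exp(i\omega\Delta t) - 1), \tag{A.22}$$

$$[D_t^- u]^n = u^n\frac{1}{\Delta t}(1 - \exp(-i\omega\Delta t)), \tag{A.23}$$

$$[D_t u]^n = u^n\frac{2}{\Delta t}i\sin\left(\frac{\omega\Delta t}{2}\right), \tag{A.24}$$

$$[D_{2t} u]^n = u^n\frac{1}{\Delta t}i\sin(\omega\Delta t). \tag{A.25}$$


Real exponentials Let $u^n = \exp(\omega n\Delta t) = e^{\omega t_n}$.

$$[D_tD_t u]^n = u^n\frac{2}{\Delta t^2}(\cosh\omega\Delta t - 1) = u^n\frac{4}{\Delta t^2}\sinh^2\left(\frac{\omega\Delta t}{2}\right), \tag{A.26}$$

$$[D_t^+ u]^n = u^n\frac{1}{\Delta t}(\exp(\omega\Delta t) - 1), \tag{A.27}$$

$$[D_t^- u]^n = u^n\frac{1}{\Delta t}(1 - \exp(-\omega\Delta t)), \tag{A.28}$$

$$[D_t u]^n = u^n\frac{2}{\Delta t}\sinh\left(\frac{\omega\Delta t}{2}\right), \tag{A.29}$$

$$[D_{2t} u]^n = u^n\frac{1}{\Delta t}\sinh(\omega\Delta t). \tag{A.30}$$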
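Formulas of this kind are exact identities and can therefore be verified to rounding accuracy. The sketch below checks the second-order and the centered first-order difference for both a complex and a real exponential. For the real exponential, hyperbolic functions take the place of the trigonometric ones. The values of w, dt and n are arbitrary choices of ours.

```python
import numpy as np

w, dt, n = 2.0, 0.1, 3

def D_t_D_t(u, n):
    """[D_t D_t u]^n for a mesh function u(n)."""
    return (u(n + 1) - 2*u(n) + u(n - 1))/dt**2

def D_t_centered(u, n):
    """[D_t u]^n, using values at the half-integer levels n +/- 1/2."""
    return (u(n + 0.5) - u(n - 0.5))/dt

u_c = lambda n: np.exp(1j*w*n*dt)   # complex exponential
u_r = lambda n: np.exp(w*n*dt)      # real exponential

# (left-hand side, right-hand side) pairs
complex_second = (D_t_D_t(u_c, n), -u_c(n)*4/dt**2*np.sin(w*dt/2)**2)
complex_first = (D_t_centered(u_c, n), u_c(n)*2/dt*1j*np.sin(w*dt/2))
real_second = (D_t_D_t(u_r, n), u_r(n)*4/dt**2*np.sinh(w*dt/2)**2)
real_first = (D_t_centered(u_r, n), u_r(n)*2/dt*np.sinh(w*dt/2))
```

Each pair agrees up to rounding errors, independently of how large $\omega\Delta t$ is.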


A.4 Finite Differences of $t^n$


The following results are useful when checking if a polynomial term in a solution fulfills the discrete equation for the numerical method.

$$[D_t^+ t]^n = 1, \tag{A.31}$$
$$[D_t^- t]^n = 1, \tag{A.32}$$
$$[D_t t]^n = 1, \tag{A.33}$$
$$[D_{2t} t]^n = 1, \tag{A.34}$$
$$[D_tD_t t]^n = 0. \tag{A.35}$$

The next formulas concern the action of difference operators on a $t^2$ term.

$$[D_t^+ t^2]^n = (2n + 1)\Delta t, \tag{A.36}$$
$$[D_t^- t^2]^n = (2n - 1)\Delta t, \tag{A.37}$$
$$[D_t t^2]^n = 2n\Delta t, \tag{A.38}$$
$$[D_{2t} t^2]^n = 2n\Delta t, \tag{A.39}$$
$$[D_tD_t t^2]^n = 2. \tag{A.40}$$

Finally, we present formulas for a $t^3$ term:

$$[D_t^+ t^3]^n = 3(n\Delta t)^2 + 3n\Delta t^2 + \Delta t^2, \tag{A.41}$$
$$[D_t^- t^3]^n = 3(n\Delta t)^2 - 3n\Delta t^2 + \Delta t^2, \tag{A.42}$$
$$[D_t t^3]^n = 3(n\Delta t)^2 + \frac{1}{4}\Delta t^2, \tag{A.43}$$
$$[D_{2t} t^3]^n = 3(n\Delta t)^2 + \Delta t^2, \tag{A.44}$$
$$[D_tD_t t^3]^n = 6n\Delta t. \tag{A.45}$$


A.4.1 Software 


Application of finite difference operators to polynomials and exponential functions, resulting in the formulas above, can easily be computed by some sympy code (from the file lib.py):


from sympy import *
t, dt, n, w = symbols('t dt n w', real=True)

# Finite difference operators

def D_t_forward(u):
    return (u(t + dt) - u(t))/dt

def D_t_backward(u):
    return (u(t) - u(t-dt))/dt

def D_t_centered(u):
    return (u(t + dt/2) - u(t-dt/2))/dt

def D_2t_centered(u):
    return (u(t + dt) - u(t-dt))/(2*dt)

def D_t_D_t(u):
    return (u(t + dt) - 2*u(t) + u(t-dt))/(dt**2)


op_list = [D_t_forward, D_t_backward,
           D_t_centered, D_2t_centered, D_t_D_t]

def ft1(t):
    return t

def ft2(t):
    return t**2

def ft3(t):
    return t**3

def f_expiwt(t):
    return exp(I*w*t)

def f_expwt(t):
    return exp(w*t)

func_list = [ft1, ft2, ft3, f_expiwt, f_expwt]


To see the results, one can now make a simple loop over the different types of 
functions and the various operators associated with them: 


for func in func_list:
    for op in op_list:
        f = func
        e = op(f)
        e = simplify(expand(e))
        print e
        if func in [f_expiwt, f_expwt]:
            e = e/f(t)
        e = e.subs(t, n*dt)
        print expand(e)
        print factor(simplify(expand(e)))


Truncation Error Analysis 


Truncation error analysis provides a widely applicable framework for analyzing the accuracy of finite difference schemes. This type of analysis can also be used for finite element and finite volume methods if the discrete equations are written in finite difference form. The result of the analysis is an asymptotic estimate of the error in the scheme on the form $Ch^r$, where $h$ is a discretization parameter ($\Delta t$, $\Delta x$, etc.), $r$ is a number, known as the convergence rate, and $C$ is a constant, typically dependent on the derivatives of the exact solution.

Knowing r gives understanding of the accuracy of the scheme. But maybe even 
more important, a powerful verification method for computer codes is to check 
that the empirically observed convergence rates in experiments coincide with the 
theoretical value of r found from truncation error analysis. 

The analysis can be carried out by hand, by symbolic software, and also numerically. All three methods will be illustrated. From examining the symbolic expressions of the truncation error we can add correction terms to the differential equations in order to increase the numerical accuracy.

In general, the term truncation error refers to the discrepancy that arises from performing a finite number of steps to approximate a process with infinitely many steps. The term is used in a number of contexts, including truncation of infinite series, finite precision arithmetic, finite differences, and differential equations. We shall be concerned with computing truncation errors arising in finite difference formulas and in finite difference discretizations of differential equations.


B.1 Overview of Truncation Error Analysis 
B.1.1 Abstract Problem Setting 
Consider an abstract differential equation

$$\mathcal{L}(u) = 0,$$

where $\mathcal{L}(u)$ is some formula involving the unknown $u$ and its derivatives. One example is $\mathcal{L}(u) = u'(t) + a(t)u(t) - b(t)$, where $a$ and $b$ are constants or functions of time. We can discretize the differential equation and obtain a corresponding discrete model, here written as

$$\mathcal{L}_\Delta(u) = 0.$$

The solution $u$ of this equation is the numerical solution. To distinguish the numerical solution from the exact solution of the differential equation problem, we denote the latter by $u_e$ and write the differential equation and its discrete counterpart as

$$\mathcal{L}(u_e) = 0,$$
$$\mathcal{L}_\Delta(u) = 0.$$

Initial and/or boundary conditions can usually be left out of the truncation error 
analysis and are omitted in the following. 

The numerical solution $u$ is, in a finite difference method, computed at a collection
of mesh points. The discrete equations represented by the abstract equation
$\mathcal{L}_\Delta(u) = 0$ are usually algebraic equations involving $u$ at some neighboring mesh
points.


B.1.2 Error Measures 


A key issue is how accurate the numerical solution is. The ultimate way of addressing
this issue would be to compute the error $u_e - u$ at the mesh points. This is
usually extremely demanding. In very simplified problem settings we may, however,
manage to derive formulas for the numerical solution $u$, and therefore closed-form
expressions for the error $u_e - u$. Such special cases can provide considerable
insight regarding accuracy and stability, but the results are established for special
problems.

The error $u_e - u$ can be computed empirically in special cases where we know
$u_e$. Such cases can be constructed by the method of manufactured solutions, where
we choose some exact solution $u_e = v$ and fit a source term $f$ in the governing
differential equation $\mathcal{L}(u_e) = f$ such that $u_e = v$ is a solution (i.e., $f = \mathcal{L}(v)$).
Assuming an error model of the form $Ch^r$, where $h$ is the discretization parameter,
such as $\Delta t$ or $\Delta x$, one can estimate the convergence rate $r$. This is a widely
applicable procedure, but the validity of the results is, strictly speaking, tied to the
chosen test problems.
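
As an illustration of the manufactured-solution procedure, the following minimal sketch applies a Forward Euler scheme to $u' + au = f$ and estimates $r$ from the error model $Ch^r$. The chosen solution $v(t) = \sin t + 2$, the value of $a$, the time interval, and all function names are assumptions made for this example only.

```python
import math

def forward_euler_max_error(dt, T=1.0, a=2.0):
    """Max error of Forward Euler for u' + a*u = f with a manufactured solution."""
    v = lambda t: math.sin(t) + 2.0      # chosen exact solution (an assumption)
    dv = lambda t: math.cos(t)           # its derivative
    f = lambda t: dv(t) + a*v(t)         # fitted source term, f = L(v)
    N = int(round(T/dt))
    u = v(0.0)                           # exact initial condition
    max_err = 0.0
    for n in range(N):
        u = u + dt*(-a*u + f(n*dt))      # Forward Euler step
        max_err = max(max_err, abs(v((n + 1)*dt) - u))
    return max_err

def observed_rates(h, E):
    """Rates r from E = C*h**r for consecutive pairs of experiments."""
    return [math.log(E[i-1]/E[i])/math.log(h[i-1]/h[i])
            for i in range(1, len(h))]

dts = [0.1/2**i for i in range(5)]
errors = [forward_euler_max_error(dt) for dt in dts]
rates = observed_rates(dts, errors)
print(rates)
```

The printed rates should approach 1 as $\Delta t$ is reduced, which is the theoretical value for the Forward Euler scheme; as noted above, this only confirms the rate for the chosen test problem.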

Another error measure arises by asking to what extent the exact solution $u_e$ fits
the discrete equations. Clearly, $u_e$ is in general not a solution of $\mathcal{L}_\Delta(u) = 0$, but we
can define the residual

$$R = \mathcal{L}_\Delta(u_e),$$

and investigate how close $R$ is to zero. A small $R$ means intuitively that the discrete
equations are close to the differential equation, and then we are tempted to think
that $u^n$ must also be close to $u_e(t_n)$.

The residual $R$ is known as the truncation error of the finite difference scheme
$\mathcal{L}_\Delta(u) = 0$. It turns out that the truncation error is relatively straightforward to
compute by hand or symbolic software without specializing the differential equation
and the discrete model to a special case. The resulting $R$ is found as a power
series in the discretization parameters. The leading-order terms in the series provide
an asymptotic measure of the accuracy of the numerical solution method (as the
discretization parameters tend to zero). An advantage of truncation error analysis,
compared to empirical estimation of convergence rates, or detailed analysis of a
special problem with a mathematical expression for the numerical solution, is that
the truncation error analysis reveals the accuracy of the various building blocks in
the numerical method and how each building block impacts the overall accuracy.
The analysis can therefore be used to detect building blocks with lower accuracy
than the others.

Knowing the truncation error or other error measures is important for verification
of programs by empirically establishing convergence rates. The forthcoming
text will provide many examples of how to compute truncation errors for finite
difference discretizations of ODEs and PDEs.


B.2 Truncation Errors in Finite Difference Formulas 


The accuracy of a finite difference formula is a fundamental issue when discretizing 
differential equations. We shall first go through a particular example in detail and 
thereafter list the truncation error in the most common finite difference approxima- 
tion formulas. 


B.2.1 Example: The Backward Difference for u'(t)


Consider a backward finite difference approximation of the first-order derivative $u'$:

$$[D_t^- u]^n = \frac{u^n - u^{n-1}}{\Delta t} \approx u'(t_n). \tag{B.1}$$

Here, $u^n$ means the value of some function $u(t)$ at a point $t_n$, and $[D_t^- u]^n$ is the
discrete derivative of $u(t)$ at $t = t_n$. The discrete derivative computed by a finite
difference is, in general, not exactly equal to the derivative $u'(t_n)$. The error in the
approximation is

$$R^n = [D_t^- u]^n - u'(t_n). \tag{B.2}$$

The common way of calculating $R^n$ is to

1. expand $u(t)$ in a Taylor series around the point where the derivative is evaluated,
   here $t_n$,
2. insert this Taylor series in (B.2), and
3. collect terms that cancel and simplify the expression.

The result is an expression for $R^n$ in terms of a power series in $\Delta t$. The error $R^n$ is
commonly referred to as the truncation error of the finite difference formula.

The Taylor series formula often found in calculus books takes the form

$$f(x + h) = \sum_{i=0}^{\infty} \frac{1}{i!}\frac{d^i f}{dx^i}(x)\,h^i.$$

In our application, we expand the Taylor series around the point where the finite
difference formula approximates the derivative. The Taylor series of $u^n$ at $t_n$ is
simply $u(t_n)$, while the Taylor series of $u^{n-1}$ at $t_n$ must employ the general formula,

$$u(t_{n-1}) = u(t_n - \Delta t) = \sum_{i=0}^{\infty} \frac{1}{i!}\frac{d^i u}{dt^i}(t_n)(-\Delta t)^i
= u(t_n) - u'(t_n)\Delta t + \frac{1}{2}u''(t_n)\Delta t^2 + O(\Delta t^3),$$

where $O(\Delta t^3)$ means a power series in $\Delta t$ where the lowest power is $\Delta t^3$. We
assume that $\Delta t$ is small such that $\Delta t^p \gg \Delta t^q$ if $p$ is smaller than $q$. The details
of higher-order terms in $\Delta t$ are therefore not of much interest. Inserting the Taylor
series above in the right-hand side of (B.2) gives rise to some algebra:

$$[D_t^- u]^n - u'(t_n) = \frac{u(t_n) - u(t_{n-1})}{\Delta t} - u'(t_n)$$
$$= \frac{u(t_n) - \left(u(t_n) - u'(t_n)\Delta t + \frac{1}{2}u''(t_n)\Delta t^2 + O(\Delta t^3)\right)}{\Delta t} - u'(t_n)$$
$$= -\frac{1}{2}u''(t_n)\Delta t + O(\Delta t^2),$$

which is, according to (B.2), the truncation error:

$$R^n = -\frac{1}{2}u''(t_n)\Delta t + O(\Delta t^2). \tag{B.3}$$

The dominating term for small $\Delta t$ is $-\frac{1}{2}u''(t_n)\Delta t$, which is proportional to $\Delta t$, and
we say that the truncation error is of first order in $\Delta t$.
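
The leading term in (B.3) can be checked numerically. The sketch below evaluates the error (B.2) for the test function $u(t) = \sin t$ at $t_n = 1$ (both choices are assumptions made for illustration) and compares it with $-\frac{1}{2}u''(t_n)\Delta t$.

```python
import math

def backward_diff_error(u, du, t, dt):
    """R = [D_t^- u] - u'(t) for a function u with known derivative du."""
    return (u(t) - u(t - dt))/dt - du(t)

t_n = 1.0
ratios = []
for dt in [0.1, 0.05, 0.025, 0.0125]:
    R = backward_diff_error(math.sin, math.cos, t_n, dt)
    leading = -0.5*(-math.sin(t_n))*dt    # -(1/2)*u''(t_n)*dt with u'' = -sin
    ratios.append(R/leading)
    print(dt, R, leading, R/leading)
```

The ratio between the computed error and the leading term should tend to 1 as $\Delta t$ is reduced, since the remaining terms are $O(\Delta t^2)$.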


B.2.2 Example: The Forward Difference for u'(t)


We can analyze the approximation error in the forward difference

$$u'(t_n) \approx [D_t^+ u]^n = \frac{u^{n+1} - u^n}{\Delta t},$$

by writing

$$R^n = [D_t^+ u]^n - u'(t_n),$$

and expanding $u^{n+1}$ in a Taylor series around $t_n$,

$$u(t_{n+1}) = u(t_n) + u'(t_n)\Delta t + \frac{1}{2}u''(t_n)\Delta t^2 + O(\Delta t^3).$$

The result becomes

$$R^n = \frac{1}{2}u''(t_n)\Delta t + O(\Delta t^2),$$

showing that also the forward difference is of first order.
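B.2.3 Example: The Central Difference for u'(t)


For the central difference approximation,

$$u'(t_n) \approx [D_t u]^n, \quad [D_t u]^n = \frac{u^{n+\frac{1}{2}} - u^{n-\frac{1}{2}}}{\Delta t},$$

we write

$$R^n = [D_t u]^n - u'(t_n),$$

and expand $u(t_{n+\frac{1}{2}})$ and $u(t_{n-\frac{1}{2}})$ in Taylor series around the point $t_n$ where the
derivative is evaluated. We have

$$u(t_{n+\frac{1}{2}}) = u(t_n) + u'(t_n)\frac{1}{2}\Delta t + \frac{1}{2}u''(t_n)\left(\frac{1}{2}\Delta t\right)^2
+ \frac{1}{6}u'''(t_n)\left(\frac{1}{2}\Delta t\right)^3 + \frac{1}{24}u''''(t_n)\left(\frac{1}{2}\Delta t\right)^4
+ \frac{1}{120}u^{(5)}(t_n)\left(\frac{1}{2}\Delta t\right)^5 + O(\Delta t^6),$$

$$u(t_{n-\frac{1}{2}}) = u(t_n) - u'(t_n)\frac{1}{2}\Delta t + \frac{1}{2}u''(t_n)\left(\frac{1}{2}\Delta t\right)^2
- \frac{1}{6}u'''(t_n)\left(\frac{1}{2}\Delta t\right)^3 + \frac{1}{24}u''''(t_n)\left(\frac{1}{2}\Delta t\right)^4
- \frac{1}{120}u^{(5)}(t_n)\left(\frac{1}{2}\Delta t\right)^5 + O(\Delta t^6).$$

Now,

$$u(t_{n+\frac{1}{2}}) - u(t_{n-\frac{1}{2}}) = u'(t_n)\Delta t + \frac{1}{24}u'''(t_n)\Delta t^3
+ \frac{1}{1920}u^{(5)}(t_n)\Delta t^5 + O(\Delta t^7).$$

By collecting terms in $[D_t u]^n - u'(t_n)$ we find the truncation error to be

$$R^n = \frac{1}{24}u'''(t_n)\Delta t^2 + O(\Delta t^4), \tag{B.4}$$

with only even powers of $\Delta t$. Since $R \sim \Delta t^2$ we say the centered difference is of
second order in $\Delta t$.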
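
A quick numerical experiment can confirm both the order and the coefficient in (B.4). The sketch below uses $u(t) = e^t$ at $t_n = 0.5$ (assumptions for illustration), for which $u'''(t_n) = e^{0.5}$, and halves $\Delta t$ repeatedly.

```python
import math

def central_diff_error(u, du, t, dt):
    """R = [D_t u] - u'(t) for the centered difference over one interval dt."""
    return (u(t + dt/2) - u(t - dt/2))/dt - du(t)

t_n = 0.5
dts = [0.2/2**i for i in range(4)]
R = [central_diff_error(math.exp, math.exp, t_n, dt) for dt in dts]

# Observed order: dt is halved between consecutive experiments
rates = [math.log(R[i-1]/R[i])/math.log(2) for i in range(1, len(R))]
# Observed leading coefficient, to be compared with u'''(t_n)/24
coeff = [R_i/dt**2 for R_i, dt in zip(R, dts)]

print(rates)
print(coeff[-1], math.exp(t_n)/24)
```

The rates should be very close to 2, and `coeff` should approach $u'''(t_n)/24$, because the next term in the series is $O(\Delta t^4)$.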
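B.2.4 Overview of Leading-Order Error Terms in Finite Difference Formulas


Here we list the leading-order terms of the truncation errors associated with several
common finite difference formulas for the first and second derivatives.

$$[D_t u]^n = \frac{u^{n+\frac{1}{2}} - u^{n-\frac{1}{2}}}{\Delta t} = u'(t_n) + R^n, \tag{B.5}$$
$$R^n = \frac{1}{24}u'''(t_n)\Delta t^2 + O(\Delta t^4) \tag{B.6}$$
$$[D_{2t} u]^n = \frac{u^{n+1} - u^{n-1}}{2\Delta t} = u'(t_n) + R^n, \tag{B.7}$$
$$R^n = \frac{1}{6}u'''(t_n)\Delta t^2 + O(\Delta t^4) \tag{B.8}$$
$$[D_t^- u]^n = \frac{u^n - u^{n-1}}{\Delta t} = u'(t_n) + R^n, \tag{B.9}$$
$$R^n = -\frac{1}{2}u''(t_n)\Delta t + O(\Delta t^2) \tag{B.10}$$
$$[D_t^+ u]^n = \frac{u^{n+1} - u^n}{\Delta t} = u'(t_n) + R^n, \tag{B.11}$$
$$R^n = \frac{1}{2}u''(t_n)\Delta t + O(\Delta t^2) \tag{B.12}$$
$$[\bar D_t u]^{n+\theta} = \frac{u^{n+1} - u^n}{\Delta t} = u'(t_{n+\theta}) + R^{n+\theta}, \tag{B.13}$$
$$R^{n+\theta} = \frac{1}{2}(1 - 2\theta)u''(t_{n+\theta})\Delta t + \frac{1}{6}\left((1-\theta)^3 + \theta^3\right)u'''(t_{n+\theta})\Delta t^2 + O(\Delta t^3) \tag{B.14}$$
$$[D_t^{2-} u]^n = \frac{3u^n - 4u^{n-1} + u^{n-2}}{2\Delta t} = u'(t_n) + R^n, \tag{B.15}$$
$$R^n = -\frac{1}{3}u'''(t_n)\Delta t^2 + O(\Delta t^3) \tag{B.16}$$
$$[D_t D_t u]^n = \frac{u^{n+1} - 2u^n + u^{n-1}}{\Delta t^2} = u''(t_n) + R^n, \tag{B.17}$$
$$R^n = \frac{1}{12}u''''(t_n)\Delta t^2 + O(\Delta t^4) \tag{B.18}$$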


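It will also be convenient to have the truncation errors for various means or
averages. The weighted arithmetic mean leads to

$$[\bar u^{t,\theta}]^{n+\theta} = \theta u^{n+1} + (1-\theta)u^n = u(t_{n+\theta}) + R^{n+\theta}, \tag{B.19}$$
$$R^{n+\theta} = \frac{1}{2}u''(t_{n+\theta})\Delta t^2\,\theta(1-\theta) + O(\Delta t^3). \tag{B.20}$$

The standard arithmetic mean follows from this formula when $\theta = \frac{1}{2}$. Expressed at
point $t_n$ we get

$$[\bar u^{t}]^n = \frac{1}{2}\left(u^{n-\frac{1}{2}} + u^{n+\frac{1}{2}}\right) = u(t_n) + R^n, \tag{B.21}$$
$$R^n = \frac{1}{8}u''(t_n)\Delta t^2 + \frac{1}{384}u''''(t_n)\Delta t^4 + O(\Delta t^6). \tag{B.22}$$

The geometric mean also has an error $O(\Delta t^2)$:

$$[\overline{u^2}^{\,t,g}]^n = u^{n-\frac{1}{2}}u^{n+\frac{1}{2}} = (u^n)^2 + R^n, \tag{B.23}$$
$$R^n = -\frac{1}{4}u'(t_n)^2\Delta t^2 + \frac{1}{4}u(t_n)u''(t_n)\Delta t^2 + O(\Delta t^4). \tag{B.24}$$

The harmonic mean is also second-order accurate:

$$[\bar u^{t,h}]^n = \frac{2}{\dfrac{1}{u^{n-\frac{1}{2}}} + \dfrac{1}{u^{n+\frac{1}{2}}}} = u(t_n) + R^n, \tag{B.25}$$
$$R^n = -\frac{u'(t_n)^2}{4u(t_n)}\Delta t^2 + \frac{1}{8}u''(t_n)\Delta t^2 + O(\Delta t^4). \tag{B.26}$$


B.2.5 Software for Computing Truncation Errors
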

We can use sympy to aid calculations with Taylor series. The derivatives can be
defined as symbols, say D3f for the 3rd derivative of some function f. A truncated
Taylor series can then be written as f + D1f*h + D2f*h**2/2. The following
class takes some symbol f for the function in question and makes a list of symbols
for the derivatives. The __call__ method computes the symbolic form of the series
truncated at num_terms terms.

    import sympy as sym

    class TaylorSeries:
        """Class for symbolic Taylor series."""
        def __init__(self, f, num_terms=4):
            self.f = f
            self.N = num_terms
            # Introduce symbols for the derivatives
            self.df = [f]
            for i in range(1, self.N+1):
                self.df.append(sym.Symbol('D%d%s' % (i, f.name)))

        def __call__(self, h):
            """Return the truncated Taylor series at x+h."""
            terms = self.f
            for i in range(1, self.N+1):
                terms += sym.Rational(1, sym.factorial(i))*self.df[i]*h**i
            return terms
We may, for example, use this class to compute the truncation error of the Forward
Euler finite difference formula:

    >>> from truncation_errors import TaylorSeries
    >>> from sympy import *
    >>> u, dt = symbols('u dt')
    >>> u_Taylor = TaylorSeries(u, 4)
    >>> u_Taylor(dt)
    D1u*dt + D2u*dt**2/2 + D3u*dt**3/6 + D4u*dt**4/24 + u
    >>> FE = (u_Taylor(dt) - u)/dt
    >>> FE
    (D1u*dt + D2u*dt**2/2 + D3u*dt**3/6 + D4u*dt**4/24)/dt
    >>> simplify(FE)
    D1u + D2u*dt/2 + D3u*dt**2/6 + D4u*dt**3/24

The truncation error consists of the terms after the first one ($u'$).

The module file trunc/truncation_errors.py contains another class
DiffOp with symbolic expressions for most of the truncation errors listed in
the previous section. For example:

    >>> from truncation_errors import DiffOp
    >>> from sympy import *
    >>> u = Symbol('u')
    >>> diffop = DiffOp(u, independent_variable='t')
    >>> diffop['geometric_mean']
    -D1u**2*dt**2/4 - D1u*D3u*dt**4/48 + D2u**2*dt**4/64 + ...
    >>> diffop['Dtm']
    D1u + D2u*dt/2 + D3u*dt**2/6 + D4u*dt**3/24
    >>> diffop.operator_names()
    ['geometric_mean', 'harmonic_mean', 'Dtm', 'D2t', 'DtDt',
     'weighted_arithmetic_mean', 'Dtp', 'Dt']

The indexing of diffop applies names that correspond to the operators: Dtp for
$D_t^+$, Dtm for $D_t^-$, Dt for $D_t$, D2t for $D_{2t}$, DtDt for $D_t D_t$.
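
For readers without sympy at hand, the Taylor bookkeeping behind the table in Sect. B.2.4 can also be done with exact rational arithmetic from the standard library. The following is a minimal sketch, not part of the truncation_errors module; the function name `taylor_coefficients` and the stencil representation are assumptions introduced here.

```python
from fractions import Fraction
from math import factorial

def taylor_coefficients(stencil, num_terms=6):
    """
    stencil maps an offset s (in units of dt) to a weight w, representing
    sum_s w*u(t_n + s*dt). Return c such that this sum equals
    sum_k c[k]*u^(k)(t_n)*dt**k for k = 0, ..., num_terms-1.
    """
    coeffs = []
    for k in range(num_terms):
        c = sum(Fraction(w)*Fraction(s)**k for s, w in stencil.items())
        coeffs.append(c/factorial(k))
    return coeffs

half = Fraction(1, 2)
# Numerators of the difference formulas; division by dt (or dt**2)
# just shifts the power of dt attached to each coefficient.
Dt   = taylor_coefficients({half: 1, -half: -1})                   # (B.5)
D2t  = taylor_coefficients({1: half, -1: -half})                   # (B.7)
D2m  = taylor_coefficients({0: Fraction(3, 2), -1: -2, -2: half})  # (B.15)
DtDt = taylor_coefficients({1: 1, 0: -2, -1: 1})                   # (B.17)

print(Dt[3], D2t[3], D2m[3], DtDt[4])
```

For the first-derivative formulas, `c[1]` is 1 and `c[3]` is the coefficient of $u'''(t_n)\Delta t^2$ in the truncation error; for $D_tD_t$, `c[2]` is 1 and `c[4]` is the coefficient of $u''''(t_n)\Delta t^2$. The values can be compared with (B.6), (B.8), (B.16), and (B.18).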


B.3 Exponential Decay ODEs 
We shall now compute the truncation error of a finite difference scheme for a
differential equation. Our first problem involves the following linear ODE that models
exponential decay,

$$u'(t) = -au(t). \tag{B.27}$$

B.3.1 Forward Euler Scheme
We begin with the Forward Euler scheme for discretizing (B.27):

$$[D_t^+ u = -au]^n. \tag{B.28}$$

The idea behind the truncation error computation is to insert the exact solution $u_e$
of the differential equation problem (B.27) in the discrete equations (B.28) and find
the residual that arises because $u_e$ does not solve the discrete equations. Instead, $u_e$
solves the discrete equations with a residual $R^n$:

$$[D_t^+ u_e + au_e = R]^n. \tag{B.29}$$

From (B.11)-(B.12) it follows that

$$[D_t^+ u_e]^n = u_e'(t_n) + \frac{1}{2}u_e''(t_n)\Delta t + O(\Delta t^2),$$

which inserted in (B.29) results in

$$u_e'(t_n) + \frac{1}{2}u_e''(t_n)\Delta t + O(\Delta t^2) + au_e(t_n) = R^n.$$

Now, $u_e'(t_n) + au_e^n = 0$ since $u_e$ solves the differential equation. The remaining
terms constitute the residual:

$$R^n = \frac{1}{2}u_e''(t_n)\Delta t + O(\Delta t^2). \tag{B.30}$$

This is the truncation error $R^n$ of the Forward Euler scheme.

Because $R^n$ is proportional to $\Delta t$, we say that the Forward Euler scheme is of
first order in $\Delta t$. However, the truncation error is just one error measure, and it is
not equal to the true error $u_e^n - u^n$. For this simple model problem we can compute
a range of different error measures for the Forward Euler scheme, including the true
error $u_e^n - u^n$, and all of them have dominating terms proportional to $\Delta t$.


B.3.2 Crank-Nicolson Scheme
For the Crank-Nicolson scheme,

$$[D_t u = -a\bar u^{t}]^{n+\frac{1}{2}}, \tag{B.31}$$

we compute the truncation error by inserting the exact solution of the ODE and
adding a residual $R$,

$$[D_t u_e + a\bar u_e^{t} = R]^{n+\frac{1}{2}}. \tag{B.32}$$

The term $[D_t u_e]^{n+\frac{1}{2}}$ is easily computed from (B.5)-(B.6) by replacing $n$ with $n + \frac{1}{2}$
in the formula,

$$[D_t u_e]^{n+\frac{1}{2}} = u_e'(t_{n+\frac{1}{2}}) + \frac{1}{24}u_e'''(t_{n+\frac{1}{2}})\Delta t^2 + O(\Delta t^4).$$

The arithmetic mean is related to $u(t_{n+\frac{1}{2}})$ by (B.21)-(B.22) so

$$[a\bar u_e^{t}]^{n+\frac{1}{2}} = au_e(t_{n+\frac{1}{2}}) + \frac{1}{8}au_e''(t_{n+\frac{1}{2}})\Delta t^2 + O(\Delta t^4).$$

Inserting these expressions in (B.32) and observing that $u_e'(t_{n+\frac{1}{2}}) + au_e(t_{n+\frac{1}{2}}) = 0$,
because $u_e(t)$ solves the ODE $u'(t) = -au(t)$ at any point $t$, we find that

$$R^{n+\frac{1}{2}} = \left(\frac{1}{24}u_e'''(t_{n+\frac{1}{2}}) + \frac{1}{8}au_e''(t_{n+\frac{1}{2}})\right)\Delta t^2 + O(\Delta t^4). \tag{B.33}$$

Here, the truncation error is of second order because the leading term in $R$ is
proportional to $\Delta t^2$.

At this point it is wise to redo some of the computations above to establish the
truncation error of the Backward Euler scheme, see Exercise B.4.


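B.3.3 The θ-Rule
We may also compute the truncation error of the $\theta$-rule,

$$[\bar D_t u = -a\bar u^{t,\theta}]^{n+\theta}.$$

Our computational task is to find $R^{n+\theta}$ in

$$[\bar D_t u_e + a\bar u_e^{t,\theta} = R]^{n+\theta}.$$

From (B.13)-(B.14) and (B.19)-(B.20) we get expressions for the terms with $u_e$.
Using that $u_e'(t_{n+\theta}) + au_e(t_{n+\theta}) = 0$, we end up with

$$R^{n+\theta} = \left(\frac{1}{2} - \theta\right)u_e''(t_{n+\theta})\Delta t + \frac{1}{2}\theta(1-\theta)\,au_e''(t_{n+\theta})\Delta t^2
+ \frac{1}{2}\left(\theta^2 - \theta + \frac{1}{3}\right)u_e'''(t_{n+\theta})\Delta t^2 + O(\Delta t^3). \tag{B.34}$$

For $\theta = \frac{1}{2}$ the first-order term vanishes and the scheme is of second order, while
for $\theta \neq \frac{1}{2}$ we only have a first-order scheme.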
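
The dependence of the order on $\theta$ can be observed directly by inserting the exact solution $u_e(t) = Ie^{-at}$ in the scheme and evaluating the residual for two values of $\Delta t$. The sketch below does this at a fixed $t_n$; the parameter values and function names are assumptions chosen for illustration.

```python
import math

def theta_rule_residual(theta, dt, t_n=0.5, a=1.5, I=1.0):
    """Residual of the theta-rule when the exact solution is inserted."""
    u_e = lambda t: I*math.exp(-a*t)
    Dt = (u_e(t_n + dt) - u_e(t_n))/dt                    # difference over [t_n, t_n+dt]
    mean = theta*u_e(t_n + dt) + (1 - theta)*u_e(t_n)     # weighted arithmetic mean
    return Dt + a*mean

def observed_rate(theta, dt=0.05):
    """Rate r in R ~ dt**r estimated from dt and dt/2."""
    R_coarse = abs(theta_rule_residual(theta, dt))
    R_fine = abs(theta_rule_residual(theta, dt/2))
    return math.log(R_coarse/R_fine)/math.log(2)

for theta in [0, 0.5, 1]:
    print(theta, observed_rate(theta))
```

With these settings the estimated rate is close to 1 for $\theta = 0$ and $\theta = 1$ and close to 2 for $\theta = \frac{1}{2}$, in line with (B.34).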


B.3.4 Using Symbolic Software 


The previously mentioned truncation_errors module can be used to automate the
Taylor series expansions and the process of collecting terms. Here is an example of
possible use:

    from truncation_errors import DiffOp
    from sympy import *

    def decay():
        u, a = symbols('u a')
        diffop = DiffOp(u, independent_variable='t',
                        num_terms_Taylor_series=3)
        D1u = diffop.D(1)   # symbol for du/dt
        ODE = D1u + a*u     # define ODE

        # Define schemes
        FE = diffop['Dtp'] + a*u
        CN = diffop['Dt' ] + a*u
        BE = diffop['Dtm'] + a*u
        theta = diffop['barDt'] + a*diffop['weighted_arithmetic_mean']
        theta = simplify(expand(theta))
        # Residuals (truncation errors)
        R = {'FE': FE-ODE, 'BE': BE-ODE, 'CN': CN-ODE,
             'theta': theta-ODE}
        return R

The returned dictionary becomes

    decay: {
     'BE': D2u*dt/2 + D3u*dt**2/6,
     'FE': -D2u*dt/2 + D3u*dt**2/6,
     'CN': D3u*dt**2/24,
     'theta': -D2u*a*dt**2*theta**2/2 + D2u*a*dt**2*theta/2 -
              D2u*dt*theta + D2u*dt/2 + D3u*a*dt**3*theta**3/3 -
              D3u*a*dt**3*theta**2/2 + D3u*a*dt**3*theta/6 +
              D3u*dt**2*theta**2/2 - D3u*dt**2*theta/2 + D3u*dt**2/6,
    }

The results are in correspondence with our hand-derived expressions.


B.3.5 Empirical Verification of the Truncation Error 


The task of this section is to demonstrate how we can compute the truncation error
$R$ numerically. For example, the truncation error of the Forward Euler scheme
applied to the decay ODE $u' = -au$ is

$$R^n = [D_t^+ u_e + au_e]^n. \tag{B.35}$$

If we happen to know the exact solution $u_e(t)$, we can easily evaluate $R^n$ from the
above formula.

To estimate how $R$ varies with the discretization parameter $\Delta t$, which has been
our focus in the previous mathematical derivations, we first make the assumption
that $R = C\Delta t^r$ for appropriate constants $C$ and $r$ and small enough $\Delta t$. The rate
$r$ can be estimated from a series of experiments where $\Delta t$ is varied. Suppose we
have $m$ experiments $(\Delta t_i, R_i)$, $i = 0,\ldots,m-1$. For two consecutive experiments
$(\Delta t_{i-1}, R_{i-1})$ and $(\Delta t_i, R_i)$, a corresponding $r_{i-1}$ can be estimated by

$$r_{i-1} = \frac{\ln(R_{i-1}/R_i)}{\ln(\Delta t_{i-1}/\Delta t_i)}, \tag{B.36}$$

for $i = 1,\ldots,m-1$. Note that the truncation error $R_i$ varies through the mesh, so
(B.36) is to be applied pointwise. A complicating issue is that $R_i$ and $R_{i-1}$ refer to
different meshes. Pointwise comparisons of the truncation error at a certain point in
all meshes therefore require any computed $R$ to be restricted to the coarsest mesh
and that all finer meshes contain all the points in the coarsest mesh. Suppose we
have $N_0$ intervals in the coarsest mesh. Inserting a superscript $n$ in (B.36), where $n$
counts mesh points in the coarsest mesh, $n = 0,\ldots,N_0$, leads to the formula

$$r_{i-1}^n = \frac{\ln(R_{i-1}^n/R_i^n)}{\ln(\Delta t_{i-1}/\Delta t_i)}. \tag{B.37}$$

Experiments are most conveniently defined by $N_0$ and a number of refinements $m$.
Suppose each mesh has twice as many cells $N_i$ as the previous one:

$$N_i = 2^i N_0, \quad \Delta t_i = T N_i^{-1},$$

where $[0, T]$ is the total time interval for the computations. Suppose the computed
$R_i$ values on the mesh with $N_i$ intervals are stored in an array R[i] (R being a list of
arrays, one for each mesh). Restricting this $R_i$ function to the coarsest mesh means
extracting every $N_i/N_0$ point and is done as follows (integer division keeps the
stride an integer):

    stride = N[i]//N_0
    R[i] = R[i][::stride]

The quantity R[i][n] now corresponds to $R_i^n$.

In addition to estimating $r$ for the pointwise values of $R = C\Delta t^r$, we may also
consider an integrated quantity on mesh $i$,

$$R_{I,i} = \left(\Delta t_i \sum_{n=0}^{N_i} (R_i^n)^2\right)^{\frac{1}{2}} \approx \left(\int_0^T R_i(t)^2\,dt\right)^{\frac{1}{2}}. \tag{B.38}$$

The sequence $R_{I,i}$, $i = 0,\ldots,m-1$, is also expected to behave as $C\Delta t^r$, with the
same $r$ as for the pointwise quantity $R$, as $\Delta t \rightarrow 0$.

The function below computes the $R_i$ and $R_{I,i}$ quantities, plots them and
compares with the theoretically derived truncation error (R_a) if available.

    import numpy as np
    import scitools.std as plt

    def estimate(truncation_error, T, N_0, m, makeplot=True):
        """
        Compute the truncation error in a problem with one independent
        variable, using m meshes, and estimate the convergence
        rate of the truncation error.

        The user-supplied function truncation_error(dt, N) computes
        the truncation error on a uniform mesh with N intervals of
        length dt::

          R, t, R_a = truncation_error(dt, N)

        where R holds the truncation error at points in the array t,
        and R_a are the corresponding theoretical truncation error
        values (None if not available).

        The truncation_error function is run on a series of meshes
        with 2**i*N_0 intervals, i=0,1,...,m-1.
        The values of R and R_a are restricted to the coarsest mesh,
        and based on these data, the convergence rate of R (pointwise)
        and time-integrated R can be estimated empirically.
        """
        N = [2**i*N_0 for i in range(m)]

        R_I = np.zeros(m)  # time-integrated R values on various meshes
        R   = [None]*m     # time series of R restricted to coarsest mesh
        R_a = [None]*m     # time series of R_a restricted to coarsest mesh
        dt = np.zeros(m)
        legends_R = [];  legends_R_a = []  # all legends of curves

        for i in range(m):
            dt[i] = T/float(N[i])
            R[i], t, R_a[i] = truncation_error(dt[i], N[i])

            R_I[i] = np.sqrt(dt[i]*np.sum(R[i]**2))

            if i == 0:
                t_coarse = t           # the coarsest mesh

            stride = N[i]//N_0         # integer stride
            R[i] = R[i][::stride]      # restrict to coarsest mesh
            R_a[i] = R_a[i][::stride]

            if makeplot:
                plt.figure(1)
                plt.plot(t_coarse, R[i], log='y')
                legends_R.append('N=%d' % N[i])
                plt.hold('on')

                plt.figure(2)
                plt.plot(t_coarse, R_a[i] - R[i], log='y')
                plt.hold('on')
                legends_R_a.append('N=%d' % N[i])

        if makeplot:
            plt.figure(1)
            plt.xlabel('time')
            plt.ylabel('pointwise truncation error')
            plt.legend(legends_R)
            plt.savefig('R_series.png')
            plt.savefig('R_series.pdf')
            plt.figure(2)
            plt.xlabel('time')
            plt.ylabel('pointwise error in estimated truncation error')
            plt.legend(legends_R_a)
            plt.savefig('R_error.png')
            plt.savefig('R_error.pdf')

        # Convergence rates
        r_R_I = convergence_rates(dt, R_I)
        print('R integrated in time; r: ' +
              ' '.join(['%.1f' % r for r in r_R_I]))
        R = np.array(R)  # two-dim. numpy array
        r_R = [convergence_rates(dt, R[:,n])[-1]
               for n in range(len(t_coarse))]


The first makeplot block demonstrates how to build up two figures in parallel,
using plt.figure(i) to create and switch to figure number i. Figure numbers
start at 1. A logarithmic scale is used on the y axis since we expect that $R$ as a
function of time (or mesh points) is exponential. The reason is that the theoretical
estimate (B.30) contains $u_e''$, which for the present model goes like $e^{-at}$. Taking the
logarithm makes a straight line.

The code follows closely the previously stated mathematical formulas, but the
statements for computing the convergence rates might deserve an explanation. The
generic help function convergence_rates(h, E) computes and returns $r_{i-1}$, $i =
1,\ldots,m-1$ from (B.37), given $\Delta t_i$ in h and $R_i^n$ in E:

    def convergence_rates(h, E):
        from math import log
        r = [log(E[i]/E[i-1])/log(h[i]/h[i-1])
             for i in range(1, len(h))]
        return r

Calling r_R_I = convergence_rates(dt, R_I) computes the sequence of
rates $r_0, r_1, \ldots, r_{m-2}$ for the model $R_I \sim \Delta t^r$, while the statements

    R = np.array(R)  # two-dim. numpy array
    r_R = [convergence_rates(dt, R[:,n])[-1]
           for n in range(len(t_coarse))]

compute the final rate $r_{m-2}$ for $R^n \sim \Delta t^r$ at each mesh point $t_n$ in the coarsest
mesh. This latter computation deserves more explanation. Since R[i][n] holds the
estimated truncation error $R_i^n$ on mesh $i$, at point $t_n$ in the coarsest mesh, R[:,n]
picks out the sequence $R_i^n$ for $i = 0,\ldots,m-1$. The convergence_rates function
computes the rates at $t_n$, and by indexing [-1] on the returned array from
convergence_rates, we pick the rate $r_{m-2}$, which we believe is the best estimate
since it is based on the two finest meshes.

The estimate function is available in a module trunc_empir.py. Let us apply
this function to estimate the truncation error of the Forward Euler scheme. We need
a function decay_FE(dt, N) that can compute (B.35) at the points in a mesh with
time step dt and N intervals:

    import numpy as np
    import trunc_empir

    def decay_FE(dt, N):
        dt = float(dt)
        t = np.linspace(0, N*dt, N+1)
        u_e = I*np.exp(-a*t)  # exact solution, I and a are global
        u = u_e  # naming convention when writing up the scheme
        R = np.zeros(N)

        for n in range(0, N):
            R[n] = (u[n+1] - u[n])/dt + a*u[n]

        # Theoretical expression for the truncation error
        R_a = 0.5*I*(-a)**2*np.exp(-a*t)*dt

        return R, t[:-1], R_a[:-1]

    if __name__ == '__main__':
        I = 1; a = 2  # global variables needed in decay_FE
        trunc_empir.estimate(decay_FE, T=2.5, N_0=6, m=4, makeplot=True)

The estimated rates for the integrated truncation error $R_I$ become 1.1, 1.0, and
1.0 for this sequence of four meshes. All the rates for $R^n$, computed as r_R, are also
very close to 1 at all mesh points. The agreement between the theoretical formula
(B.30) and the computed quantity (B.35) is very good, as illustrated in Figs. B.1
and B.2. The program trunc_decay_FE.py was used to perform the simulations
and it can easily be modified to test other schemes (see also Exercise B.5).

Fig. B.1 Estimated truncation error at mesh points for different meshes

Fig. B.2 Difference between theoretical and estimated truncation error at mesh points for different
meshes


B.3.6 Increasing the Accuracy by Adding Correction Terms 

Now we ask the question: can we add terms in the differential equation that can help
increase the order of the truncation error? To be precise, let us revisit the Forward
Euler scheme for $u' = -au$, insert the exact solution $u_e$, include a residual $R$, but
also include new terms $C$:

$$[D_t^+ u_e + au_e = C + R]^n. \tag{B.39}$$

Inserting the Taylor expansions for $[D_t^+ u_e]^n$ and keeping terms up to 3rd order in
$\Delta t$ gives the equation

$$\frac{1}{2}u_e''(t_n)\Delta t + \frac{1}{6}u_e'''(t_n)\Delta t^2 + \frac{1}{24}u_e''''(t_n)\Delta t^3 + O(\Delta t^4) = C^n + R^n.$$

Can we find $C^n$ such that $R^n$ is $O(\Delta t^2)$? Yes, by setting

$$C^n = \frac{1}{2}u_e''(t_n)\Delta t,$$

we manage to cancel the first-order term and

$$R^n = \frac{1}{6}u_e'''(t_n)\Delta t^2 + O(\Delta t^3).$$

The correction term $C^n$ introduces $\frac{1}{2}\Delta t\,u''$ in the discrete equation, and we have
to get rid of the derivative $u''$. One idea is to approximate $u''$ by a second-order
accurate finite difference formula, $u'' \approx (u^{n+1} - 2u^n + u^{n-1})/\Delta t^2$, but this introduces
an additional time level with $u^{n-1}$. Another approach is to rewrite $u''$ in terms of $u'$
or $u$ using the ODE:

$$u' = -au \quad\Rightarrow\quad u'' = -au' = -a(-au) = a^2u.$$

This means that we can simply set $C^n = \frac{1}{2}a^2\Delta t\,u^n$. We can then either solve the
discrete equation

$$\left[D_t^+ u = -au + \frac{1}{2}a^2\Delta t\,u\right]^n, \tag{B.40}$$

or we can equivalently discretize the perturbed ODE

$$u' = -\hat a u, \quad \hat a = a\left(1 - \frac{1}{2}a\Delta t\right), \tag{B.41}$$

by a Forward Euler method. That is, we replace the original coefficient $a$ by the
perturbed coefficient $\hat a$. Observe that $\hat a \rightarrow a$ as $\Delta t \rightarrow 0$.

The Forward Euler method applied to (B.41) results in

$$\left[D_t^+ u = -a\left(1 - \frac{1}{2}a\Delta t\right)u\right]^n.$$

We can check our computations and verify that the truncation error of the scheme
above is indeed $O(\Delta t^2)$.

Another way of revealing the fact that the perturbed ODE leads to a more accurate
solution is to look at the amplification factor. Our scheme can be written
as

$$u^{n+1} = Au^n, \quad A = 1 - \hat a\Delta t = 1 - p + \frac{1}{2}p^2, \quad p = a\Delta t.$$

The amplification factor $A$ as a function of $p = a\Delta t$ is seen to be the first three
terms of the Taylor series for the exact amplification factor $e^{-p}$. The Forward Euler
scheme for $u' = -au$ gives only the first two terms $1 - p$ of the Taylor series for
$e^{-p}$. That is, using $\hat a$ increases the order of the accuracy in the amplification factor.
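
The effect of the perturbed coefficient can be observed in the true error as well. The following sketch (parameter values and function names are assumptions made for illustration) runs the Forward Euler scheme with $a$ and with $\hat a = a(1 - \frac{1}{2}a\Delta t)$ and estimates the convergence rate of the error at the end time $T$.

```python
import math

def forward_euler(a_eff, I, T, dt):
    """Forward Euler for u' = -a_eff*u, u(0) = I; returns u at t = T."""
    N = int(round(T/dt))
    u = I
    for n in range(N):
        u = (1 - a_eff*dt)*u
    return u

def error_rates(use_correction, a=1.0, I=1.0, T=2.0):
    """Convergence rates of the error at t = T when dt is halved repeatedly."""
    dts = [0.1/2**i for i in range(4)]
    E = []
    for dt in dts:
        a_eff = a*(1 - 0.5*a*dt) if use_correction else a
        E.append(abs(I*math.exp(-a*T) - forward_euler(a_eff, I, T, dt)))
    return [math.log(E[i-1]/E[i])/math.log(2) for i in range(1, len(E))]

print('original a: ', error_rates(False))
print('perturbed a:', error_rates(True))
```

The rates for the original coefficient should approach 1, while those for the perturbed coefficient should approach 2, consistent with the amplification factor matching $e^{-p}$ to one more term.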

If, instead of replacing $u''$ by $a^2u$, we use the relation $u'' = -au'$, we add a term
$-\frac{1}{2}a\Delta t\,u'$ in the ODE:

$$u' = -au - \frac{1}{2}a\Delta t\,u' \quad\Rightarrow\quad \left(1 + \frac{1}{2}a\Delta t\right)u' = -au.$$

Using a Forward Euler method results in

$$\left(1 + \frac{1}{2}a\Delta t\right)\frac{u^{n+1} - u^n}{\Delta t} = -au^n,$$

which after some algebra can be written as

$$u^{n+1} = \frac{1 - \frac{1}{2}a\Delta t}{1 + \frac{1}{2}a\Delta t}\,u^n.$$

This is the same formula as the one arising from a Crank-Nicolson scheme applied
to $u' = -au$! It is now recommended to do Exercise B.6 and repeat the above
steps to see what kind of correction term is needed in the Backward Euler scheme
to make it second order.

The Crank-Nicolson scheme is a bit more challenging to analyze, but the ideas
and techniques are the same. The discrete equation reads

$$[D_t u = -a\bar u^{t}]^{n+\frac{1}{2}},$$

and the truncation error is defined through

$$[D_t u_e + a\bar u_e^{t} = C + R]^{n+\frac{1}{2}},$$

where we have added a correction term. We need to Taylor expand both the discrete
derivative and the arithmetic mean with aid of (B.5)-(B.6) and (B.21)-(B.22),
respectively. The result is

$$\frac{1}{24}u_e'''(t_{n+\frac{1}{2}})\Delta t^2 + O(\Delta t^4) + \frac{a}{8}u_e''(t_{n+\frac{1}{2}})\Delta t^2 + O(\Delta t^4) = C^{n+\frac{1}{2}} + R^{n+\frac{1}{2}}.$$

The goal now is to make $C^{n+\frac{1}{2}}$ cancel the $\Delta t^2$ terms:

$$C^{n+\frac{1}{2}} = \frac{1}{24}u_e'''(t_{n+\frac{1}{2}})\Delta t^2 + \frac{a}{8}u_e''(t_{n+\frac{1}{2}})\Delta t^2.$$

Using $u' = -au$, we have that $u'' = a^2u$, and we find that $u''' = -a^3u$. We can
therefore solve the perturbed ODE problem

$$u' = -\hat a u, \quad \hat a = a\left(1 - \frac{1}{12}a^2\Delta t^2\right),$$

by the Crank-Nicolson scheme and obtain a method that is of fourth order in $\Delta t$.
Exercise B.7 encourages you to implement these correction terms and calculate
empirical convergence rates to verify that higher-order accuracy is indeed obtained
in real computations.
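
As a starting point for such an experiment, the sketch below compares the Crank-Nicolson scheme with the original and the perturbed coefficient for one set of parameters. All names and values are assumptions for illustration, and the sketch is not a substitute for the more thorough investigation asked for in the exercise.

```python
import math

def crank_nicolson(a_eff, I, T, dt):
    """Crank-Nicolson for u' = -a_eff*u, u(0) = I; returns u at t = T."""
    N = int(round(T/dt))
    A = (1 - 0.5*a_eff*dt)/(1 + 0.5*a_eff*dt)   # amplification factor
    return I*A**N

def cn_error_rates(use_correction, a=1.0, I=1.0, T=2.0):
    """Convergence rates of the error at t = T when dt is halved repeatedly."""
    dts = [0.2/2**i for i in range(4)]
    E = []
    for dt in dts:
        a_eff = a*(1 - a**2*dt**2/12.0) if use_correction else a
        E.append(abs(I*math.exp(-a*T) - crank_nicolson(a_eff, I, T, dt)))
    return [math.log(E[i-1]/E[i])/math.log(2) for i in range(1, len(E))]

print('original a: ', cn_error_rates(False))
print('perturbed a:', cn_error_rates(True))
```

For these parameters the observed rates should be close to 2 without the correction and close to 4 with it. For much smaller $\Delta t$ the fourth-order errors approach the round-off level and the estimated rates become unreliable.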


B.3.7 Extension to Variable Coefficients 
Let us address the decay ODE with variable coefficients,

$$u'(t) = -a(t)u(t) + b(t),$$

discretized by the Forward Euler scheme,

$$[D_t^+ u = -au + b]^n. \tag{B.42}$$

The truncation error $R$ is as always found by inserting the exact solution $u_e(t)$ in
the discrete scheme:

$$[D_t^+ u_e + au_e - b = R]^n. \tag{B.43}$$

Using (B.11)-(B.12),

$$u_e'(t_n) + \frac{1}{2}u_e''(t_n)\Delta t + O(\Delta t^2) + a(t_n)u_e(t_n) - b(t_n) = R^n.$$

Because of the ODE,

$$u_e'(t_n) + a(t_n)u_e(t_n) - b(t_n) = 0,$$

we are left with the result

$$R^n = \frac{1}{2}u_e''(t_n)\Delta t + O(\Delta t^2). \tag{B.44}$$

We see that the variable coefficients do not pose any additional difficulties in this
case. Exercise B.8 takes the analysis above one step further to the Crank-Nicolson
scheme.


B.3.8 Exact Solutions of the Finite Difference Equations 


Having a mathematical expression for the numerical solution is very valuable in
program verification, since we then know the exact numbers that the program should
produce. Looking at the various formulas for the truncation errors in (B.5)-(B.6)
and (B.25)-(B.26) in Sect. B.2.4, we see that all but two of the $R$ expressions
contain a second or higher order derivative of $u_e$. The exceptions are the geometric and
harmonic means where the truncation error involves $u_e'$ and even $u_e$ in case of the
harmonic mean. So, apart from these two means, choosing $u_e$ to be a linear function
of $t$, $u_e = ct + d$ for constants $c$ and $d$, will make the truncation error vanish
since $u_e'' = 0$. Consequently, the truncation error of a finite difference scheme will
be zero since the various approximations used will all be exact. This means that the
linear solution is an exact solution of the discrete equations.

In a particular differential equation problem, the reasoning above can be used to
determine if we expect a linear $u_e$ to fulfill the discrete equations. To actually prove
that this is true, we can either compute the truncation error and see that it vanishes,
or we can simply insert $u_e(t) = ct + d$ in the scheme and see that it fulfills the
equations. The latter method is usually the simplest. It will often be necessary to
add some source term to the ODE in order to allow a linear solution.
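
The idea can be turned into a simple verification test. The sketch below, with arbitrarily chosen constants and coefficient function (all assumptions for illustration), fits the source term $b(t)$ so that $u_e(t) = ct + d$ solves $u' = -a(t)u + b(t)$, and checks that the Forward Euler scheme reproduces $u_e$ to machine precision.

```python
def solve_forward_euler(a, b, I, dt, N):
    """Forward Euler for u' = -a(t)*u + b(t), u(0) = I, on N steps of length dt."""
    u = [I]
    for n in range(N):
        t_n = n*dt
        u.append(u[n] + dt*(-a(t_n)*u[n] + b(t_n)))
    return u

c, d = 0.7, -1.2
u_e = lambda t: c*t + d            # linear exact solution
a = lambda t: 1.0 + t**2           # arbitrary variable coefficient
b = lambda t: c + a(t)*u_e(t)      # source term fitted so that u_e solves the ODE

dt, N = 0.25, 8
u = solve_forward_euler(a, b, u_e(0), dt, N)
max_diff = max(abs(u[n] - u_e(n*dt)) for n in range(N + 1))
print(max_diff)
```

The difference should be at the level of round-off errors for any $\Delta t$, which makes this a sharp test: an implementation error typically produces a difference many orders of magnitude larger.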

Many ODEs are discretized by centered differences. From Sect. B.2.4 we see
that all the centered difference formulas have truncation errors involving $u_e'''$ or
higher-order derivatives. A quadratic solution, e.g., $u_e(t) = t^2 + ct + d$, will
then make the truncation errors vanish. This observation can be used to test if a
quadratic solution will fulfill the discrete equations. Note that a quadratic solution
will not obey the equations for a Crank-Nicolson scheme for $u' = -au + b$ because
the approximation applies an arithmetic mean, which involves a truncation
error with $u_e''$.


B.3.9 Computing Truncation Errors in Nonlinear Problems 


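The general nonlinear ODE

$$u' = f(u, t), \tag{B.45}$$

can be solved by a Crank-Nicolson scheme

$$[D_t u = \bar f^{t}]^{n+\frac{1}{2}}. \tag{B.46}$$

The truncation error is as always defined as the residual arising when inserting the
exact solution $u_e$ in the scheme:

$$[D_t u_e - \bar f^{t} = R]^{n+\frac{1}{2}}. \tag{B.47}$$

Using (B.21)-(B.22) for $\bar f^{t}$, with $f$ viewed as a function of $t$ along the exact
solution, results in

$$[\bar f^{t}]^{n+\frac{1}{2}} = \frac{1}{2}\left(f(u_e^n, t_n) + f(u_e^{n+1}, t_{n+1})\right)
= f(u_e^{n+\frac{1}{2}}, t_{n+\frac{1}{2}}) + \frac{1}{8}\frac{d^2}{dt^2}f(u_e(t), t)\Big|_{t=t_{n+\frac{1}{2}}}\Delta t^2 + O(\Delta t^4).$$

With (B.5)-(B.6) the discrete equations (B.47) lead to

$$u_e'(t_{n+\frac{1}{2}}) + \frac{1}{24}u_e'''(t_{n+\frac{1}{2}})\Delta t^2 - f(u_e^{n+\frac{1}{2}}, t_{n+\frac{1}{2}})
- \frac{1}{8}\frac{d^2}{dt^2}f(u_e(t), t)\Big|_{t=t_{n+\frac{1}{2}}}\Delta t^2 + O(\Delta t^4) = R^{n+\frac{1}{2}}.$$

Since $u_e'(t_{n+\frac{1}{2}}) - f(u_e^{n+\frac{1}{2}}, t_{n+\frac{1}{2}}) = 0$, the truncation error becomes

$$R^{n+\frac{1}{2}} = \left(\frac{1}{24}u_e'''(t_{n+\frac{1}{2}}) - \frac{1}{8}\frac{d^2}{dt^2}f(u_e(t), t)\Big|_{t=t_{n+\frac{1}{2}}}\right)\Delta t^2 + O(\Delta t^4).$$

The computational techniques worked well even for this nonlinear ODE.
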

B.4 Vibration ODEs 
B.41 Linear Model Without Damping 


The next example on computing the truncation error involves the following ODE
for vibration problems:
\[ u''(t) + \omega^2 u(t) = 0. \tag{B.48} \]
Here, $\omega$ is a given constant.

The truncation error of a centered finite difference scheme  Using a standard,
second-order, central difference for the second-order derivative in time, we have
the scheme
\[ [D_t D_t u + \omega^2 u = 0]^n. \tag{B.49} \]
Inserting the exact solution $u_e$ in this equation and adding a residual $R$ so that
$u_e$ can fulfill the equation results in
\[ [D_t D_t u_e + \omega^2 u_e = R]^n. \tag{B.50} \]
To calculate the truncation error $R^n$, we use (B.17)-(B.18), i.e.,
\[ [D_t D_t u_e]^n = u_e''(t_n) + \frac{1}{12}u_e''''(t_n)\Delta t^2 + O(\Delta t^4), \]
and the fact that $u_e''(t) + \omega^2 u_e(t) = 0$. The result is
\[ R^n = \frac{1}{12}u_e''''(t_n)\Delta t^2 + O(\Delta t^4). \tag{B.51} \]
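The leading term in (B.51) can be checked numerically. The following is a minimal sketch, not code from the book's software: it assumes the exact solution $u_e(t) = I\cos\omega t$ and introduces a hypothetical helper residual_centered that evaluates the residual of the centered scheme directly, for comparison with $\frac{1}{12}u_e''''(t_n)\Delta t^2$.

```python
import numpy as np

def residual_centered(I, w, dt, t_n):
    """Residual R^n of [D_t D_t u + w^2 u]^n with u_e(t) = I*cos(w*t) inserted."""
    u_e = lambda t: I*np.cos(w*t)
    DtDt = (u_e(t_n + dt) - 2*u_e(t_n) + u_e(t_n - dt))/dt**2
    return DtDt + w**2*u_e(t_n)

I, w, t_n = 1.0, 2.0, 0.3      # illustrative values (assumptions)
for dt in 0.1, 0.05, 0.025:
    R = residual_centered(I, w, dt, t_n)
    # u_e''''(t) = I*w**4*cos(w*t) for the chosen exact solution
    leading = I*w**4*np.cos(w*t_n)*dt**2/12
    print('dt=%.3f  R=%.6e  leading term=%.6e  ratio=%.5f'
          % (dt, R, leading, R/leading))
```

The ratio between the computed residual and the leading term should approach 1 as $\Delta t$ is reduced, since the neglected terms are $O(\Delta t^4)$.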
The truncation error of approximating $u'(0)$  The initial conditions for (B.48)
are $u(0) = I$ and $u'(0) = V$. The latter involves a finite difference approximation.
The standard choice
\[ [D_{2t} u = V]^0, \]
where $u^{-1}$ is eliminated with the aid of the discretized ODE for $n = 0$, involves
a centered difference with an $O(\Delta t^2)$ truncation error given by (B.7)-(B.8). The
simpler choice
\[ [D_t^+ u = V]^0, \]
is based on a forward difference with a truncation error $O(\Delta t)$. A central question
is if this initial error will impact the order of the scheme throughout the simulation.
Exercise B.11 asks you to perform an experiment to investigate this question.

Truncation error of the equation for the first step  We have shown that the truncation
error of the difference used to approximate the initial condition $u'(0) = V$ is
$O(\Delta t^2)$, but we can also investigate the difference equation used for the first step.
In a truncation error setting, the right way to view this equation is not to use the
initial condition $[D_{2t} u = V]^0$ to express $u^{-1} = u^1 - 2\Delta t V$ in order to eliminate
$u^{-1}$ from the discretized differential equation, but the other way around: the fundamental
equation is the discretized initial condition $[D_{2t} u = V]^0$ and we use the
discretized ODE $[D_t D_t u + \omega^2 u = 0]^0$ to eliminate $u^{-1}$ in the discretized initial
condition. From $[D_t D_t u + \omega^2 u = 0]^0$ we have
\[ u^{-1} = 2u^0 - u^1 - \Delta t^2\omega^2 u^0, \]
which inserted in $[D_{2t} u = V]^0$ gives
\[ \frac{u^1 - u^0}{\Delta t} + \frac{1}{2}\omega^2\Delta t\, u^0 = V. \tag{B.52} \]
The first term can be recognized as a forward difference such that the equation can
be written in operator notation as
\[ [D_t^+ u + \frac{1}{2}\omega^2\Delta t\, u = V]^0. \]
The truncation error is defined as
\[ [D_t^+ u_e + \frac{1}{2}\omega^2\Delta t\, u_e - V = R]^0. \]
Using (B.11)-(B.12) with one more term in the Taylor series, we get that
\[ u_e'(0) + \frac{1}{2}u_e''(0)\Delta t + \frac{1}{6}u_e'''(0)\Delta t^2 + O(\Delta t^3)
   + \frac{1}{2}\omega^2\Delta t\, u_e(0) - V = R^0. \]
Now, $u_e'(0) = V$ and $u_e''(0) = -\omega^2 u_e(0)$ so we get
\[ R^0 = \frac{1}{6}u_e'''(0)\Delta t^2 + O(\Delta t^3). \]


There is another way of analyzing the discrete initial condition, because eliminating
$u^{-1}$ via the discretized ODE can be expressed as
\[ [D_{2t} u + \frac{1}{2}\Delta t(D_t D_t u + \omega^2 u) = V]^0. \tag{B.53} \]
Writing out (B.53) shows that the equation is equivalent to (B.52), since
$[D_{2t}u + \frac{1}{2}\Delta t D_tD_tu]^0 = [D_t^+u]^0$. The truncation
error is defined by
\[ [D_{2t} u_e + \frac{1}{2}\Delta t(D_t D_t u_e + \omega^2 u_e) = V + R]^0. \]
Replacing the differences via (B.7)-(B.8) and (B.17)-(B.18), as well as using
$u_e'(0) = V$ and $u_e''(0) = -\omega^2 u_e(0)$, gives
\[ R^0 = \frac{1}{6}u_e'''(0)\Delta t^2 + O(\Delta t^3). \]
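The second-order behavior of the first-step equation (B.52) can be illustrated with a small experiment. This is a sketch under assumptions: the exact solution is taken as $u_e(t) = I\cos\omega t + (V/\omega)\sin\omega t$, and the helper first_step_residual is a hypothetical name introduced only for this illustration. For this solution $u_e'''(0) = -\omega^2 V$, so the predicted leading term is $-\frac{1}{6}\omega^2 V\Delta t^2$.

```python
import numpy as np

def first_step_residual(I, V, w, dt):
    """Residual R^0 of (u^1 - u^0)/dt + 0.5*w^2*dt*u^0 = V with u_e inserted."""
    u_e = lambda t: I*np.cos(w*t) + V/w*np.sin(w*t)
    return (u_e(dt) - u_e(0))/dt + 0.5*w**2*dt*u_e(0) - V

I, V, w = 1.0, 0.5, 2.0            # illustrative values (assumptions)
dts = [0.01, 0.005, 0.0025]
R = [first_step_residual(I, V, w, dt) for dt in dts]
# Predicted leading term (1/6)*u_e'''(0)*dt^2 with u_e'''(0) = -w^2*V
leading = [-w**2*V*dt**2/6 for dt in dts]
# Observed convergence rate between consecutive time steps
rates = [np.log(R[i-1]/R[i])/np.log(dts[i-1]/dts[i])
         for i in range(1, len(dts))]
print(R, leading, rates)
```

The observed rates should be close to 2, in agreement with $R^0 = O(\Delta t^2)$.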


Computing correction terms  The idea of using correction terms to increase the
order of $R^n$ can be applied as described in Sect. B.3.6. We look at
\[ [D_t D_t u_e + \omega^2 u_e = C + R]^n, \]
and observe that $C^n$ must be chosen to cancel the $\Delta t^2$ term in $R^n$. That is,
\[ C^n = \frac{1}{12}u_e''''(t_n)\Delta t^2. \]
To get rid of the 4th-order derivative we can use the differential equation:
$u'' = -\omega^2 u$, which implies $u'''' = \omega^4 u$. Adding the correction term to the ODE results in
\[ u'' + \omega^2\left(1 - \frac{1}{12}\omega^2\Delta t^2\right)u = 0. \tag{B.54} \]
Solving this equation by the standard scheme
\[ \left[D_t D_t u + \omega^2\left(1 - \frac{1}{12}\omega^2\Delta t^2\right)u = 0\right]^n, \]
will result in a scheme with truncation error $O(\Delta t^4)$.

We can use another set of arguments to justify that (B.54) leads to a higher-order
method. Mathematical analysis of the scheme (B.49) reveals that the numerical
frequency $\tilde\omega$ is (approximately as $\Delta t \rightarrow 0$)
\[ \tilde\omega = \omega\left(1 + \frac{1}{24}\omega^2\Delta t^2\right). \]
One can therefore attempt to replace $\omega$ in the ODE by a slightly smaller $\omega$ since the
numerics will make it larger:
\[ \left[u'' + \left(\omega\left(1 - \frac{1}{24}\omega^2\Delta t^2\right)\right)^2 u = 0\right]. \]
Expanding the squared term and omitting the higher-order term $\Delta t^4$ gives exactly
the ODE (B.54). Experiments show that $u^n$ is computed to 4th order in $\Delta t$. You
can confirm this by running a little program in the vib directory:


from vib_undamped import convergence_rates, solver_adjust_w

r = convergence_rates(
    m=5, solver_function=solver_adjust_w, num_periods=8)


One will see that the rates r lie around 4. 
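For readers without access to the vib directory, the experiment can be mimicked by a self-contained sketch. The functions solver and estimate_rates below are stand-ins written for this illustration and are not the book's vib_undamped implementation; the mesh sizes, the error norm, and the choice $u'(0) = 0$ are assumptions.

```python
import numpy as np

def solver(I, w, dt, T, adjust_w=False):
    """Solve u'' + w^2*u = 0, u(0)=I, u'(0)=0 by [D_t D_t u + w^2 u = 0]^n."""
    Nt = int(round(T/dt))
    t = np.linspace(0, Nt*dt, Nt+1)
    # Optionally replace w by the slightly smaller value w*(1 - w^2*dt^2/24)
    w_num = w*(1 - w**2*dt**2/24.0) if adjust_w else w
    u = np.zeros(Nt+1)
    u[0] = I
    u[1] = u[0] - 0.5*dt**2*w_num**2*u[0]    # first step with u'(0)=0
    for n in range(1, Nt):
        u[n+1] = 2*u[n] - u[n-1] - dt**2*w_num**2*u[n]
    return u, t

def estimate_rates(adjust_w, I=1.0, w=2*np.pi, num_periods=4, m=4):
    """Return observed convergence rates for m successively halved time steps."""
    P = 2*np.pi/w                            # period
    dts = [P/20/2**i for i in range(m)]
    E = []
    for dt in dts:
        u, t = solver(I, w, dt, num_periods*P, adjust_w)
        e = I*np.cos(w*t) - u                # error against exact solution
        E.append(np.sqrt(dt*np.sum(e**2)))   # discrete L2 norm
    return [np.log(E[i-1]/E[i])/np.log(dts[i-1]/dts[i])
            for i in range(1, m)]

r_standard = estimate_rates(adjust_w=False)
r_adjusted = estimate_rates(adjust_w=True)
print('standard w:', r_standard)
print('adjusted w:', r_adjusted)
```

With the standard $\omega$ the rates should lie around 2, while the adjusted $\omega$ should give rates around 4.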


B.4.2 Model with Damping and Nonlinearity 


The model (B.48) can be extended to include damping $\beta u'$, a nonlinear restoring
(spring) force $s(u)$, and some known excitation force $F(t)$:
\[ mu'' + \beta u' + s(u) = F(t). \tag{B.55} \]
The coefficient $m$ usually represents the mass of the system. This governing equation
can be discretized by centered differences:
\[ [mD_t D_t u + \beta D_{2t}u + s(u) = F]^n. \tag{B.56} \]
The exact solution $u_e$ fulfills the discrete equations with a residual term:
\[ [mD_t D_t u_e + \beta D_{2t}u_e + s(u_e) = F + R]^n. \tag{B.57} \]
Using (B.17)-(B.18) and (B.7)-(B.8) we get
\[ [mD_t D_t u_e + \beta D_{2t}u_e]^n = mu_e''(t_n) + \beta u_e'(t_n)
   + \left(\frac{m}{12}u_e''''(t_n) + \frac{\beta}{6}u_e'''(t_n)\right)\Delta t^2 + O(\Delta t^4). \]
Combining this with the previous equation, we can collect the terms
\[ mu_e''(t_n) + \beta u_e'(t_n) + s(u_e(t_n)) - F^n, \]
and set this sum to zero because $u_e$ solves the differential equation. We are left with
the truncation error
\[ R^n = \left(\frac{m}{12}u_e''''(t_n) + \frac{\beta}{6}u_e'''(t_n)\right)\Delta t^2 + O(\Delta t^4), \tag{B.58} \]
so the scheme is of second order.

According to (B.58), we can add correction terms
\[ C^n = \left(\frac{m}{12}u_e''''(t_n) + \frac{\beta}{6}u_e'''(t_n)\right)\Delta t^2, \]
to the right-hand side of the ODE to obtain a fourth-order scheme. However, expressing
$u''''$ and $u'''$ in terms of lower-order derivatives is now harder because the
differential equation is more complicated:
\[ u''' = \frac{1}{m}\left(F' - \beta u'' - s'(u)u'\right), \]
\[ u'''' = \frac{1}{m}\left(F'' - \beta u''' - s''(u)(u')^2 - s'(u)u''\right)
        = \frac{1}{m}\left(F'' - \beta\frac{1}{m}\left(F' - \beta u'' - s'(u)u'\right) - s''(u)(u')^2 - s'(u)u''\right). \]
It is not impossible to discretize the resulting modified ODE, but it is up for debate
whether correction terms are feasible and the way to go. Computing with a smaller
$\Delta t$ is usually possible in these problems to achieve the desired accuracy.


B.4.3 Extension to Quadratic Damping 


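Instead of the linear damping term $\beta u'$ in (B.55) we now consider quadratic damping
$\beta|u'|u'$:
\[ mu'' + \beta|u'|u' + s(u) = F(t). \tag{B.59} \]
A centered difference for $u'$ gives rise to a nonlinearity, which can be linearized using
a geometric mean: $[|u'|u']^n \approx |[u']^{n-\frac{1}{2}}|[u']^{n+\frac{1}{2}}$. The resulting scheme becomes
\[ [mD_t D_t u]^n + \beta|[D_t u]^{n-\frac{1}{2}}|[D_t u]^{n+\frac{1}{2}} + s(u^n) = F^n. \tag{B.60} \]
The truncation error is defined through
\[ [mD_t D_t u_e]^n + \beta|[D_t u_e]^{n-\frac{1}{2}}|[D_t u_e]^{n+\frac{1}{2}} + s(u_e^n) - F^n = R^n. \tag{B.61} \]
We start with expressing the truncation error of the geometric mean. According
to (B.23)-(B.24), applied with $u_e'$ as the averaged quantity,
\[ |[D_t u_e]^{n-\frac{1}{2}}|[D_t u_e]^{n+\frac{1}{2}} = [|D_t u_e|D_t u_e]^n
   - \frac{1}{4}u_e''(t_n)^2\Delta t^2 + \frac{1}{4}u_e'(t_n)u_e'''(t_n)\Delta t^2 + O(\Delta t^4). \]
Using (B.5)-(B.6) for the $D_t u_e$ factors results in
\[ [|D_t u_e|D_t u_e]^n = \left|u_e' + \frac{1}{24}u_e'''(t_n)\Delta t^2 + O(\Delta t^4)\right|
   \left(u_e' + \frac{1}{24}u_e'''(t_n)\Delta t^2 + O(\Delta t^4)\right). \]
We can remove the absolute value since it essentially gives a factor 1 or $-1$ only.
Calculating the product, we have the leading-order terms
\[ [D_t u_e D_t u_e]^n = (u_e'(t_n))^2 + \frac{1}{12}u_e'(t_n)u_e'''(t_n)\Delta t^2 + O(\Delta t^4). \]
With
\[ m[D_t D_t u_e]^n = mu_e''(t_n) + \frac{m}{12}u_e''''(t_n)\Delta t^2 + O(\Delta t^4), \]
and using the differential equation on the form $mu'' + \beta(u')^2 + s(u) = F$, we end
up with
\[ R^n = \left(\frac{m}{12}u_e''''(t_n) + \frac{\beta}{12}u_e'(t_n)u_e'''(t_n)
   + \frac{\beta}{4}\left(u_e'(t_n)u_e'''(t_n) - u_e''(t_n)^2\right)\right)\Delta t^2 + O(\Delta t^4), \]
where the last group of terms is the contribution from the geometric mean.
This result demonstrates that we have second-order accuracy also with quadratic
damping. The key elements that lead to the second-order accuracy are that the difference
approximations are $O(\Delta t^2)$ and the geometric mean approximation is also
$O(\Delta t^2)$.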
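The $O(\Delta t^2)$ error of the geometric mean can be checked directly. The sketch below assumes a test function $v(t) = \sin 3t$ (playing the role of $u_e'$) and a hypothetical helper geometric_mean_error; it compares $v(t_n - \frac{1}{2}\Delta t)v(t_n + \frac{1}{2}\Delta t) - v(t_n)^2$ with the leading term $\frac{1}{4}(v v'' - (v')^2)\Delta t^2$ from (B.23)-(B.24).

```python
import numpy as np

def geometric_mean_error(v, t, dt):
    """Return v(t - dt/2)*v(t + dt/2) - v(t)**2."""
    return v(t - dt/2)*v(t + dt/2) - v(t)**2

# Test function and its derivatives (an assumption made for illustration)
v = lambda t: np.sin(3*t)
dv = lambda t: 3*np.cos(3*t)
d2v = lambda t: -9*np.sin(3*t)

t_n = 0.4
ratios = []
for dt in 0.04, 0.02, 0.01:
    actual = geometric_mean_error(v, t_n, dt)
    predicted = 0.25*(v(t_n)*d2v(t_n) - dv(t_n)**2)*dt**2
    ratios.append(actual/predicted)
    print('dt=%.2f  actual=%.6e  predicted=%.6e' % (dt, actual, predicted))
```

The ratios between the actual and the predicted error should tend to 1 as $\Delta t$ decreases.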


B.4.4 The General Model Formulated as First-Order ODEs 
The second-order model (B.59) can be formulated as a first-order system,
\[ v' = \frac{1}{m}\left(F(t) - \beta|v|v - s(u)\right), \tag{B.62} \]
\[ u' = v. \tag{B.63} \]
The system (B.62)-(B.63) can be solved either by a forward-backward scheme (the
Euler-Cromer method) or a centered scheme on a staggered mesh.

A centered scheme on a staggered mesh  We now introduce a staggered mesh
where we seek $u$ at mesh points $t_n$ and $v$ at points $t_{n+\frac{1}{2}}$ in between the $u$ points.
The staggered mesh makes it easy to formulate centered differences in the system
(B.62)-(B.63):
\[ [D_t u = v]^{n-\frac{1}{2}}, \tag{B.64} \]
\[ \left[D_t v = \frac{1}{m}\left(F(t) - \beta|v|v - s(u)\right)\right]^n. \tag{B.65} \]
The term $|v^n|v^n$ causes trouble since $v^n$ is not computed, only $v^{n-\frac{1}{2}}$ and $v^{n+\frac{1}{2}}$. Using
geometric mean, we can express $|v^n|v^n$ in terms of known quantities:
$|v^n|v^n \approx |v^{n-\frac{1}{2}}|v^{n+\frac{1}{2}}$. We then have
\[ [D_t u]^{n-\frac{1}{2}} = v^{n-\frac{1}{2}}, \tag{B.66} \]
\[ [D_t v]^n = \frac{1}{m}\left(F(t_n) - \beta|v^{n-\frac{1}{2}}|v^{n+\frac{1}{2}} - s(u^n)\right). \tag{B.67} \]
The truncation error in each equation fulfills
\[ [D_t u_e]^{n-\frac{1}{2}} = v_e(t_{n-\frac{1}{2}}) + R_u^{n-\frac{1}{2}}, \]
\[ [D_t v_e]^n = \frac{1}{m}\left(F(t_n) - \beta|v_e(t_{n-\frac{1}{2}})|v_e(t_{n+\frac{1}{2}}) - s(u_e(t_n))\right) + R_v^n. \]
The truncation error of the centered differences is given by (B.5)-(B.6), and the
geometric mean approximation analysis can be taken from (B.23)-(B.24). These
results lead to
\[ u_e'(t_{n-\frac{1}{2}}) + \frac{1}{24}u_e'''(t_{n-\frac{1}{2}})\Delta t^2 + O(\Delta t^4) = v_e(t_{n-\frac{1}{2}}) + R_u^{n-\frac{1}{2}}, \]
and
\[ v_e'(t_n) = \frac{1}{m}\left(F(t_n) - \beta|v_e(t_n)|v_e(t_n) + O(\Delta t^2) - s(u_e(t_n))\right) + R_v^n. \]
The ODEs fulfilled by $u_e$ and $v_e$ are evident in these equations, and we achieve
second-order accuracy for the truncation error in both equations:
\[ R_u^{n-\frac{1}{2}} = O(\Delta t^2), \quad R_v^n = O(\Delta t^2). \]


B.5 Wave Equations 
B.5.1 Linear Wave Equation in 1D 


The standard, linear wave equation in 1D for a function $u(x,t)$ reads
\[ \frac{\partial^2 u}{\partial t^2} = c^2\frac{\partial^2 u}{\partial x^2} + f(x,t), \quad x\in(0,L),\ t\in(0,T], \tag{B.68} \]
where $c$ is the constant wave velocity of the physical medium in $[0, L]$. The equation
can also be more compactly written as
\[ u_{tt} = c^2u_{xx} + f, \quad x\in(0,L),\ t\in(0,T]. \tag{B.69} \]
Centered, second-order finite differences are a natural choice for discretizing the
derivatives, leading to
\[ [D_t D_t u = c^2D_x D_x u + f]_i^n. \tag{B.70} \]
Inserting the exact solution $u_e(x,t)$ in (B.70) makes this function fulfill the equation
if we add the term $R$:
\[ [D_t D_t u_e = c^2D_x D_x u_e + f + R]_i^n. \tag{B.71} \]
Our purpose is to calculate the truncation error $R$. From (B.17)-(B.18) we have
that
\[ [D_t D_t u_e]_i^n = u_{e,tt}(x_i,t_n) + \frac{1}{12}u_{e,tttt}(x_i,t_n)\Delta t^2 + O(\Delta t^4), \]
when we use a notation taking into account that $u_e$ is a function of two variables
and that derivatives must be partial derivatives. The notation $u_{e,tt}$ means $\partial^2u_e/\partial t^2$.
The same formula may also be applied to the $x$-derivative term:
\[ [D_x D_x u_e]_i^n = u_{e,xx}(x_i,t_n) + \frac{1}{12}u_{e,xxxx}(x_i,t_n)\Delta x^2 + O(\Delta x^4). \]
Equation (B.71) now becomes
\[ u_{e,tt} + \frac{1}{12}u_{e,tttt}(x_i,t_n)\Delta t^2 = c^2u_{e,xx} + c^2\frac{1}{12}u_{e,xxxx}(x_i,t_n)\Delta x^2 + f(x_i,t_n)
   + O(\Delta t^4,\Delta x^4) + R_i^n. \]
Because $u_e$ fulfills the partial differential equation (PDE) (B.69), the first, third, and
fifth term cancel out, and we are left with
\[ R_i^n = \frac{1}{12}u_{e,tttt}(x_i,t_n)\Delta t^2 - c^2\frac{1}{12}u_{e,xxxx}(x_i,t_n)\Delta x^2 + O(\Delta t^4,\Delta x^4), \tag{B.72} \]
showing that the scheme (B.70) is of second order in the time and space mesh
spacing.
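The formula (B.72) can be tested on a simple standing wave. The sketch below assumes $f = 0$ and the exact solution $u_e = \sin(kx)\cos(ckt)$; the function names wave_residual and leading_term are hypothetical helpers for this illustration only. For this $u_e$ the two leading terms in (B.72) combine to $\frac{1}{12}c^2k^4(c^2\Delta t^2 - \Delta x^2)u_e$, which vanishes when $c\Delta t = \Delta x$.

```python
import numpy as np

def wave_residual(c, k, dx, dt, x, t):
    """Residual of [D_t D_t u = c^2 D_x D_x u] with u_e = sin(kx)cos(ckt)."""
    u_e = lambda x, t: np.sin(k*x)*np.cos(c*k*t)
    DtDt = (u_e(x, t + dt) - 2*u_e(x, t) + u_e(x, t - dt))/dt**2
    DxDx = (u_e(x + dx, t) - 2*u_e(x, t) + u_e(x - dx, t))/dx**2
    return DtDt - c**2*DxDx

def leading_term(c, k, dx, dt, x, t):
    """The dt^2 and dx^2 terms of (B.72) for the same u_e."""
    u = np.sin(k*x)*np.cos(c*k*t)
    u_tttt = (c*k)**4*u
    u_xxxx = k**4*u
    return u_tttt*dt**2/12 - c**2*u_xxxx*dx**2/12

c, k, x, t = 1.5, 2.0, 0.4, 0.3     # illustrative values (assumptions)
dx = 0.02
R = wave_residual(c, k, dx, 0.005, x, t)
R_lead = leading_term(c, k, dx, 0.005, x, t)
R_unit_courant = wave_residual(c, k, dx, dx/c, x, t)   # c*dt = dx
print(R, R_lead, R_unit_courant)
```

The first two printed numbers should agree to several digits, while the residual for $c\Delta t = \Delta x$ should be at round-off level.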
B.5.2 Finding Correction Terms 


Can we add correction terms to the PDE and increase the order of $R_i^n$ in (B.72)?
The starting point is
\[ [D_t D_t u_e = c^2D_x D_x u_e + f + C + R]_i^n. \tag{B.73} \]
From the previous analysis we simply get (B.72) again, but now with $C$:
\[ R_i^n + C_i^n = \frac{1}{12}u_{e,tttt}(x_i,t_n)\Delta t^2 - c^2\frac{1}{12}u_{e,xxxx}(x_i,t_n)\Delta x^2 + O(\Delta t^4,\Delta x^4). \tag{B.74} \]
The idea is to let $C_i^n$ cancel the $\Delta t^2$ and $\Delta x^2$ terms to make $R_i^n = O(\Delta t^4,\Delta x^4)$:
\[ C_i^n = \frac{1}{12}u_{e,tttt}(x_i,t_n)\Delta t^2 - c^2\frac{1}{12}u_{e,xxxx}(x_i,t_n)\Delta x^2. \]
Essentially, it means that we add a new term
\[ C = \frac{1}{12}\left(u_{tttt}\Delta t^2 - c^2u_{xxxx}\Delta x^2\right), \]
to the right-hand side of the PDE. We must either discretize these 4th-order derivatives
directly or rewrite them in terms of lower-order derivatives with the aid of the
PDE. The latter approach is more feasible. From the PDE we have the operator
equality
\[ \frac{\partial^2}{\partial t^2} = c^2\frac{\partial^2}{\partial x^2}, \]
so
\[ u_{tttt} = c^2u_{xxtt}, \quad u_{xxxx} = c^{-2}u_{ttxx}. \]
Assuming $u$ is smooth enough, so that $u_{xxtt} = u_{ttxx}$, these relations lead to
\[ C = \frac{1}{12}\left((c^2\Delta t^2 - \Delta x^2)u_{xx}\right)_{tt}. \]
A natural discretization is
\[ C_i^n = \frac{1}{12}(c^2\Delta t^2 - \Delta x^2)[D_x D_x D_t D_t u]_i^n. \]
Writing out $[D_x D_x D_t D_t u]_i^n$ as $[D_x D_x(D_t D_t u)]_i^n$ gives
\[ \frac{1}{\Delta t^2}\left(\frac{u_{i+1}^{n+1} - 2u_{i+1}^{n} + u_{i+1}^{n-1}}{\Delta x^2}
   - 2\frac{u_{i}^{n+1} - 2u_{i}^{n} + u_{i}^{n-1}}{\Delta x^2}
   + \frac{u_{i-1}^{n+1} - 2u_{i-1}^{n} + u_{i-1}^{n-1}}{\Delta x^2}\right). \]
Now the unknown values $u_{i+1}^{n+1}$, $u_i^{n+1}$, and $u_{i-1}^{n+1}$ are coupled, and we must solve a
tridiagonal system to find them. This is in principle straightforward, but it results
in an implicit finite difference scheme, while we had a convenient explicit scheme
without the correction terms.


B.5.3 Extension to Variable Coefficients 


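Now we address the variable coefficient version of the linear 1D wave equation,
\[ \frac{\partial^2 u}{\partial t^2} = \frac{\partial}{\partial x}\left(\lambda(x)\frac{\partial u}{\partial x}\right), \]
or written more compactly as
\[ u_{tt} = (\lambda u_x)_x. \tag{B.75} \]
The discrete counterpart to this equation, using arithmetic mean for $\lambda$ and centered
differences, reads
\[ [D_t D_t u = D_x\overline{\lambda}^{x}D_x u]_i^n. \tag{B.76} \]
The truncation error is the residual $R$ in the equation
\[ [D_t D_t u_e = D_x\overline{\lambda}^{x}D_x u_e + R]_i^n. \tag{B.77} \]
The difficulty with (B.77) is how to compute the truncation error of the term
$[D_x\overline{\lambda}^{x}D_x u_e]_i^n$.
We start by writing out the outer operator:
\[ [D_x\overline{\lambda}^{x}D_x u_e]_i^n = \frac{1}{\Delta x}\left([\overline{\lambda}^{x}D_x u_e]_{i+\frac{1}{2}}^n - [\overline{\lambda}^{x}D_x u_e]_{i-\frac{1}{2}}^n\right). \tag{B.78} \]
With the aid of (B.5)-(B.6) and (B.21)-(B.22) we have
\[ [D_x u_e]_{i+\frac{1}{2}}^n = u_{e,x}(x_{i+\frac{1}{2}},t_n) + \frac{1}{24}u_{e,xxx}(x_{i+\frac{1}{2}},t_n)\Delta x^2 + O(\Delta x^4), \]
\[ [\overline{\lambda}^{x}]_{i+\frac{1}{2}} = \lambda(x_{i+\frac{1}{2}}) + \frac{1}{8}\lambda''(x_{i+\frac{1}{2}})\Delta x^2 + O(\Delta x^4), \]
\[ [\overline{\lambda}^{x}D_x u_e]_{i+\frac{1}{2}}^n = \left(\lambda(x_{i+\frac{1}{2}}) + \frac{1}{8}\lambda''(x_{i+\frac{1}{2}})\Delta x^2 + O(\Delta x^4)\right)
   \times\left(u_{e,x}(x_{i+\frac{1}{2}},t_n) + \frac{1}{24}u_{e,xxx}(x_{i+\frac{1}{2}},t_n)\Delta x^2 + O(\Delta x^4)\right) \]
\[ \quad = \lambda(x_{i+\frac{1}{2}})u_{e,x}(x_{i+\frac{1}{2}},t_n) + \lambda(x_{i+\frac{1}{2}})\frac{1}{24}u_{e,xxx}(x_{i+\frac{1}{2}},t_n)\Delta x^2
   + u_{e,x}(x_{i+\frac{1}{2}},t_n)\frac{1}{8}\lambda''(x_{i+\frac{1}{2}})\Delta x^2 + O(\Delta x^4) \]
\[ \quad = [\lambda u_{e,x}]_{i+\frac{1}{2}}^n + G_{i+\frac{1}{2}}^n\Delta x^2 + O(\Delta x^4), \]
where we have introduced the short form
\[ G_{i+\frac{1}{2}}^n = \frac{1}{24}u_{e,xxx}(x_{i+\frac{1}{2}},t_n)\lambda(x_{i+\frac{1}{2}}) + u_{e,x}(x_{i+\frac{1}{2}},t_n)\frac{1}{8}\lambda''(x_{i+\frac{1}{2}}). \]
Similarly, we find that
\[ [\overline{\lambda}^{x}D_x u_e]_{i-\frac{1}{2}}^n = [\lambda u_{e,x}]_{i-\frac{1}{2}}^n + G_{i-\frac{1}{2}}^n\Delta x^2 + O(\Delta x^4). \]
Inserting these expressions in the outer operator (B.78) results in
\[ [D_x\overline{\lambda}^{x}D_x u_e]_i^n = \frac{1}{\Delta x}\left([\overline{\lambda}^{x}D_x u_e]_{i+\frac{1}{2}}^n - [\overline{\lambda}^{x}D_x u_e]_{i-\frac{1}{2}}^n\right) \]
\[ \quad = \frac{1}{\Delta x}\left([\lambda u_{e,x}]_{i+\frac{1}{2}}^n + G_{i+\frac{1}{2}}^n\Delta x^2 - [\lambda u_{e,x}]_{i-\frac{1}{2}}^n - G_{i-\frac{1}{2}}^n\Delta x^2 + O(\Delta x^4)\right) \]
\[ \quad = [D_x\lambda u_{e,x}]_i^n + [D_x G]_i^n\Delta x^2 + O(\Delta x^4). \]
The reason for $O(\Delta x^4)$ in the remainder is that there are coefficients in front of this
term, say $H\Delta x^4$, and the subtraction and division by $\Delta x$ results in $[D_x H]_i^n\Delta x^4$.

We can now use (B.5)-(B.6) to express the $D_x$ operator in $[D_x\lambda u_{e,x}]_i^n$ as a
derivative and a truncation error:
\[ [D_x\lambda u_{e,x}]_i^n = \frac{\partial}{\partial x}\lambda(x_i)u_{e,x}(x_i,t_n) + \frac{1}{24}(\lambda u_{e,x})_{xxx}(x_i,t_n)\Delta x^2 + O(\Delta x^4). \]
Expressions like $[D_x G]_i^n\Delta x^2$ can be treated in an identical way,
\[ [D_x G]_i^n\Delta x^2 = G_x(x_i,t_n)\Delta x^2 + \frac{1}{24}G_{xxx}(x_i,t_n)\Delta x^4 + O(\Delta x^4). \]
There will be a number of terms with the $\Delta x^2$ factor. We lump these now into
$O(\Delta x^2)$. The result of the truncation error analysis of the spatial derivative is therefore
summarized as
\[ [D_x\overline{\lambda}^{x}D_x u_e]_i^n = \frac{\partial}{\partial x}\lambda(x_i)u_{e,x}(x_i,t_n) + O(\Delta x^2). \]
After having treated the $[D_t D_t u_e]_i^n$ term as well, we achieve
\[ R_i^n = O(\Delta x^2) + \frac{1}{12}u_{e,tttt}(x_i,t_n)\Delta t^2. \]
The main conclusion is that the scheme is of second order in time and space also in
this variable coefficient case. The key ingredients for second order are the centered
differences and the arithmetic mean for $\lambda$: all those building blocks feature second-order
accuracy.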
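The second-order accuracy of the spatial operator with an arithmetic mean can be observed in a small experiment. This is a sketch under assumptions: $\lambda(x) = 1 + x^2$ and $u(x) = \sin x$ are arbitrary smooth test functions, and discrete_flux_divergence is a hypothetical helper that writes out $[D_x\overline{\lambda}^{x}D_x u]_i$ at a single point.

```python
import numpy as np

def discrete_flux_divergence(lam, u, x, dx):
    """[D_x lam_bar^x D_x u] at x, with arithmetic means of lam at x +/- dx/2."""
    lam_right = 0.5*(lam(x) + lam(x + dx))
    lam_left = 0.5*(lam(x) + lam(x - dx))
    return (lam_right*(u(x + dx) - u(x)) -
            lam_left*(u(x) - u(x - dx)))/dx**2

lam = lambda x: 1 + x**2
u = lambda x: np.sin(x)
# Exact (lam*u')' = lam'*u' + lam*u'' for the chosen functions
exact = lambda x: 2*x*np.cos(x) - (1 + x**2)*np.sin(x)

x0 = 0.7
dxs = [0.1, 0.05, 0.025]
errors = [abs(discrete_flux_divergence(lam, u, x0, dx) - exact(x0))
          for dx in dxs]
rates = [np.log(errors[i-1]/errors[i])/np.log(dxs[i-1]/dxs[i])
         for i in range(1, len(dxs))]
print(errors, rates)
```

The computed rates should be close to 2, consistent with the $O(\Delta x^2)$ result above.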


B.5.4 Linear Wave Equation in 2D/3D 


The two-dimensional extension of (B.68) takes the form
\[ \frac{\partial^2 u}{\partial t^2} = c^2\left(\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2}\right) + f(x,y,t),
   \quad (x,y)\in(0,L)\times(0,H),\ t\in(0,T], \tag{B.79} \]
where now $c$ is the constant wave velocity of the physical medium $[0,L]\times[0,H]$.
In compact notation, the PDE (B.79) can be written
\[ u_{tt} = c^2(u_{xx} + u_{yy}) + f(x,y,t), \quad (x,y)\in(0,L)\times(0,H),\ t\in(0,T], \tag{B.80} \]
in 2D, while the 3D version reads
\[ u_{tt} = c^2(u_{xx} + u_{yy} + u_{zz}) + f(x,y,z,t), \tag{B.81} \]
for $(x,y,z)\in(0,L)\times(0,H)\times(0,B)$ and $t\in(0,T]$.

Approximating the second-order derivatives by the standard formulas (B.17)-(B.18)
yields the scheme
\[ [D_t D_t u = c^2(D_x D_x u + D_y D_y u + D_z D_z u) + f]_{i,j,k}^n. \tag{B.82} \]
The truncation error is found from
\[ [D_t D_t u_e = c^2(D_x D_x u_e + D_y D_y u_e + D_z D_z u_e) + f + R]_{i,j,k}^n. \tag{B.83} \]
The calculations from the 1D case can be repeated with the terms in the $y$ and $z$
directions. Collecting terms that fulfill the PDE, we end up with
\[ R_{i,j,k}^n = \left[\frac{1}{12}u_{e,tttt}\Delta t^2 - c^2\frac{1}{12}\left(u_{e,xxxx}\Delta x^2 + u_{e,yyyy}\Delta y^2 + u_{e,zzzz}\Delta z^2\right)\right]_{i,j,k}^n
   + O(\Delta t^4,\Delta x^4,\Delta y^4,\Delta z^4). \tag{B.84} \]

B.6 Diffusion Equations 
B.6.1 Linear Diffusion Equation in 1D 
The standard, linear, 1D diffusion equation takes the form
\[ \frac{\partial u}{\partial t} = \alpha\frac{\partial^2 u}{\partial x^2} + f(x,t), \quad x\in(0,L),\ t\in(0,T], \tag{B.85} \]
where $\alpha > 0$ is a constant diffusion coefficient. A more compact form of the
diffusion equation is $u_t = \alpha u_{xx} + f$.

The spatial derivative in the diffusion equation, $\alpha u_{xx}$, is commonly discretized
as $[\alpha D_x D_x u]_i^n$. The time-derivative, however, can be treated by a variety of methods.

The Forward Euler scheme in time  Let us start with the simple Forward Euler
scheme:
\[ [D_t^+ u = \alpha D_x D_x u + f]_i^n. \]
The truncation error arises as the residual $R$ when inserting the exact solution $u_e$ in
the discrete equations:
\[ [D_t^+ u_e = \alpha D_x D_x u_e + f + R]_i^n. \]
Now, using (B.11)-(B.12) and (B.17)-(B.18), we can transform the difference operators
to derivatives:
\[ u_{e,t}(x_i,t_n) + \frac{1}{2}u_{e,tt}(x_i,t_n)\Delta t + O(\Delta t^2)
   = \alpha u_{e,xx}(x_i,t_n) + \frac{\alpha}{12}u_{e,xxxx}(x_i,t_n)\Delta x^2 + O(\Delta x^4) + f(x_i,t_n) + R_i^n. \]
The terms $u_{e,t}(x_i,t_n) - \alpha u_{e,xx}(x_i,t_n) - f(x_i,t_n)$ vanish because $u_e$ solves the PDE.
The truncation error then becomes
\[ R_i^n = \frac{1}{2}u_{e,tt}(x_i,t_n)\Delta t + O(\Delta t^2) - \frac{\alpha}{12}u_{e,xxxx}(x_i,t_n)\Delta x^2 + O(\Delta x^4). \]

The Crank-Nicolson scheme in time  The Crank-Nicolson method consists of using
a centered difference for $u_t$ and an arithmetic average of the $u_{xx}$ term:
\[ [D_t u]_i^{n+\frac{1}{2}} = \alpha\frac{1}{2}\left([D_x D_x u]_i^n + [D_x D_x u]_i^{n+1}\right) + f_i^{n+\frac{1}{2}}. \]
The equation for the truncation error is
\[ [D_t u_e]_i^{n+\frac{1}{2}} = \alpha\frac{1}{2}\left([D_x D_x u_e]_i^n + [D_x D_x u_e]_i^{n+1}\right) + f_i^{n+\frac{1}{2}} + R_i^{n+\frac{1}{2}}. \]
To find the truncation error, we start by expressing the arithmetic average in terms
of values at time $t_{n+\frac{1}{2}}$. According to (B.21)-(B.22),
\[ \frac{1}{2}\left([D_x D_x u_e]_i^n + [D_x D_x u_e]_i^{n+1}\right) = [D_x D_x u_e]_i^{n+\frac{1}{2}} + \frac{1}{8}[D_x D_x u_{e,tt}]_i^{n+\frac{1}{2}}\Delta t^2
   + O(\Delta t^4). \]
With (B.17)-(B.18) we can express the difference operator $D_x D_x u$ in terms of a
derivative:
\[ [D_x D_x u_e]_i^{n+\frac{1}{2}} = u_{e,xx}(x_i,t_{n+\frac{1}{2}}) + \frac{1}{12}u_{e,xxxx}(x_i,t_{n+\frac{1}{2}})\Delta x^2 + O(\Delta x^4). \]
The error term from the arithmetic mean is similarly expanded,
\[ \frac{1}{8}[D_x D_x u_{e,tt}]_i^{n+\frac{1}{2}}\Delta t^2 = \frac{1}{8}u_{e,ttxx}(x_i,t_{n+\frac{1}{2}})\Delta t^2 + O(\Delta t^2\Delta x^2). \]
The time derivative is analyzed using (B.5)-(B.6):
\[ [D_t u_e]_i^{n+\frac{1}{2}} = u_{e,t}(x_i,t_{n+\frac{1}{2}}) + \frac{1}{24}u_{e,ttt}(x_i,t_{n+\frac{1}{2}})\Delta t^2 + O(\Delta t^4). \]
Summing up all the contributions and noting that
\[ u_{e,t}(x_i,t_{n+\frac{1}{2}}) = \alpha u_{e,xx}(x_i,t_{n+\frac{1}{2}}) + f(x_i,t_{n+\frac{1}{2}}), \]
the truncation error is given by
\[ R_i^{n+\frac{1}{2}} = \left(\frac{1}{24}u_{e,ttt}(x_i,t_{n+\frac{1}{2}}) - \frac{\alpha}{8}u_{e,ttxx}(x_i,t_{n+\frac{1}{2}})\right)\Delta t^2
   - \frac{\alpha}{12}u_{e,xxxx}(x_i,t_{n+\frac{1}{2}})\Delta x^2 + O(\Delta x^4) + O(\Delta t^4) + O(\Delta t^2\Delta x^2). \]
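The different orders in time of the two schemes can be observed by evaluating the residuals for a known solution. The sketch below assumes $f = 0$, the exact solution $u_e = e^{-\alpha k^2 t}\sin kx$, and $\Delta x = \Delta t$ during refinement; all function names are hypothetical and introduced only for this illustration.

```python
import numpy as np

alpha, k = 1.0, 2.0                      # illustrative values (assumptions)
u_e = lambda x, t: np.exp(-alpha*k**2*t)*np.sin(k*x)

def DxDx(x, t, dx):
    """Centered second-order difference of u_e in space."""
    return (u_e(x + dx, t) - 2*u_e(x, t) + u_e(x - dx, t))/dx**2

def R_forward_euler(x, t, dx, dt):
    """Residual of [D_t^+ u = alpha D_x D_x u] at time t."""
    return (u_e(x, t + dt) - u_e(x, t))/dt - alpha*DxDx(x, t, dx)

def R_crank_nicolson(x, t_mid, dx, dt):
    """Residual of the Crank-Nicolson scheme centered at time t_mid."""
    t0, t1 = t_mid - dt/2, t_mid + dt/2
    return (u_e(x, t1) - u_e(x, t0))/dt \
           - alpha*0.5*(DxDx(x, t0, dx) + DxDx(x, t1, dx))

def rates(R, x=0.5, t=0.1, dts=(0.02, 0.01, 0.005)):
    """Observed rates of |R| when dt is halved and dx = dt."""
    E = [abs(R(x, t, dt, dt)) for dt in dts]
    return [np.log(E[i-1]/E[i])/np.log(dts[i-1]/dts[i])
            for i in range(1, len(dts))]

r_FE = rates(R_forward_euler)
r_CN = rates(R_crank_nicolson)
print('Forward Euler:', r_FE)
print('Crank-Nicolson:', r_CN)
```

With $\Delta x = \Delta t$ the Forward Euler residual is dominated by the $\Delta t$ term, so the rates should be near 1, while the Crank-Nicolson rates should be near 2.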


B.6.2 Nonlinear Diffusion Equation in 1D 


We address the PDE
\[ \frac{\partial u}{\partial t} = \frac{\partial}{\partial x}\left(\alpha(u)\frac{\partial u}{\partial x}\right) + f(u), \]
with two potentially nonlinear coefficients $f(u)$ and $\alpha(u)$. We use a Backward Euler
scheme with arithmetic mean for $\alpha(u)$,
\[ [D_t^- u = D_x\overline{\alpha(u)}^{x}D_x u + f(u)]_i^n. \]
Inserting $u_e$ defines the truncation error $R$:
\[ [D_t^- u_e = D_x\overline{\alpha(u_e)}^{x}D_x u_e + f(u_e) + R]_i^n. \]
The most computationally challenging part is the variable coefficient with $\alpha(u)$,
but we can use the same setup as in Sect. B.5.3 and arrive at a truncation error
$O(\Delta x^2)$ for the $x$-derivative term. The nonlinear term $[f(u_e)]_i^n = f(u_e(x_i,t_n))$
matches $x$ and $t$ derivatives of $u_e$ in the PDE. We end up with
\[ R_i^n = -\frac{1}{2}\frac{\partial^2}{\partial t^2}u_e(x_i,t_n)\Delta t + O(\Delta x^2). \]


B.7 Exercises 


Exercise B.1: Truncation error of a weighted mean 
Derive the truncation error of the weighted mean in (B.19)-(B.20). 


Hint Expand $u_e^{n+1}$ and $u_e^n$ around $t_{n+\theta}$.


Filename: trunc_weighted_mean. 


Exercise B.2: Simulate the error of a weighted mean 
We consider the weighted mean
\[ u_e(t_{n+\theta}) \approx \theta u_e^{n+1} + (1-\theta)u_e^n. \]
Choose some specific function for $u_e(t)$ and compute the error in this approximation
for a sequence of decreasing $\Delta t = t_{n+1} - t_n$ and for $\theta = 0, 0.25, 0.5, 0.75, 1$.
Assuming that the error equals $C\Delta t^r$, for some constants $C$ and $r$, compute $r$ for
the two smallest $\Delta t$ values for each choice of $\theta$ and compare with the truncation
error (B.19)-(B.20).

Filename: trunc_theta_avg. 


Exercise B.3: Verify a truncation error formula 
Set up a numerical experiment as explained in Sect. B.3.5 for verifying the formulas 
(B.15)-(B.16). 


Filename: trunc_backward_2level. 


Problem B.4: Truncation error of the Backward Euler scheme 

Derive the truncation error of the Backward Euler scheme for the decay ODE
$u' = -au$ with constant $a$. Extend the analysis to cover the variable-coefficient case
$u' = -a(t)u + b(t)$.

Filename: trunc_decay_BE.

Exercise B.5: Empirical estimation of truncation errors

Use the ideas and tools from Sect. B.3.5 to estimate the rate of the truncation error of
the Backward Euler and Crank-Nicolson schemes applied to the exponential decay
model $u' = -au$, $u(0) = I$.

Hint In the Backward Euler scheme, the truncation error can be estimated at mesh
points $n = 1,\ldots,N$, while the truncation error must be estimated at midpoints
$t_{n+\frac{1}{2}}$, $n = 0,\ldots,N-1$ for the Crank-Nicolson scheme. The
truncation_error(dt, N) function to be supplied to the estimate function needs to
carefully implement these details and return the right t array such that t[i] is the time
point corresponding to the quantities R[i] and R_a[i].

Filename: trunc_decay_BNCN. 


Exercise B.6: Correction term for a Backward Euler scheme 

Consider the model $u' = -au$, $u(0) = I$. Use the ideas of Sect. B.3.6 to add a
correction term to the ODE such that the Backward Euler scheme applied to the
perturbed ODE problem is of second order in $\Delta t$. Find the amplification factor.

Filename: trunc_decay_BE_corr.


Problem B.7: Verify the effect of correction terms 

Make a program that solves $u' = -au$, $u(0) = I$, by the $\theta$-rule and computes
convergence rates. Adjust $a$ such that it incorporates correction terms. Run the
program to verify that the error from the Forward and Backward Euler schemes
with perturbed $a$ is $O(\Delta t^2)$, while the error arising from the Crank-Nicolson scheme
with perturbed $a$ is $O(\Delta t^4)$.

Filename: trunc_decay_corr_verify. 


Problem B.8: Truncation error of the Crank-Nicolson scheme 
The variable-coefficient ODE $u' = -a(t)u + b(t)$ can be discretized in two different
ways by the Crank-Nicolson scheme, depending on whether we use averages for $a$
and $b$ or compute them at the midpoint $t_{n+\frac{1}{2}}$:
\[ [D_t u = -a\overline{u}^{t} + b]^{n+\frac{1}{2}}, \tag{B.86} \]
\[ [D_t u = \overline{-au + b}^{t}]^{n+\frac{1}{2}}. \tag{B.87} \]


Compute the truncation error in both cases. 
Filename: trunc_decay_CN_vc. 


Problem B.9: Truncation error of $u' = f(u, t)$
Consider the general nonlinear first-order scalar ODE
\[ u'(t) = f(u(t), t). \]
Show that the truncation error in the Forward Euler scheme,
\[ [D_t^+ u = f(u,t)]^n, \]
and in the Backward Euler scheme,
\[ [D_t^- u = f(u,t)]^n, \]
both are of first order, regardless of what $f$ is.

Showing the order of the truncation error in the Crank-Nicolson scheme,
\[ [D_t u = \overline{f(u,t)}^{t}]^{n+\frac{1}{2}}, \]
is somewhat more involved: Taylor expand $u_e^n$, $u_e^{n+1}$, $f(u_e^n, t_n)$, and $f(u_e^{n+1}, t_{n+1})$
around $t_{n+\frac{1}{2}}$, and use that
\[ \frac{df}{dt} = \frac{\partial f}{\partial u}u' + \frac{\partial f}{\partial t}. \]
Check that the derived truncation error is consistent with previous results for the 
case f(u,t) = —au. 


Filename: trunc_nonlinear_ODE. 


Exercise B.10: Truncation error of $[D_t D_t u]^n$

Derive the truncation error of the finite difference approximation (B.17)-(B.18) to 
the second-order derivative. 

Filename: trunc_d2u. 


Exercise B.11: Investigate the impact of approximating u’ (0) 

Section B.4.1 describes two ways of discretizing the initial condition $u'(0) = V$ for
a vibration model $u'' + \omega^2 u = 0$: a centered difference $[D_{2t}u = V]^0$ or a forward
difference $[D_t^+ u = V]^0$. The program vib_undamped.py solves $u'' + \omega^2 u = 0$
with $[D_{2t}u = 0]^0$ and features a function convergence_rates for computing
the order of the error in the numerical solution. Modify this program such that it
applies the forward difference $[D_t^+ u = 0]^0$ and report how this simpler and more
convenient approximation impacts the overall convergence rate of the scheme.
Filename: trunc_vib_ic_fw. 


Problem B.12: Investigate the accuracy of a simplified scheme 
Consider the ODE
\[ mu'' + \beta|u'|u' + s(u) = F(t). \]
The term $|u'|u'$ quickly gives rise to nonlinearities and complicates the scheme.
Why not simply apply a backward difference to this term such that it only involves
known values? That is, we propose to solve
\[ [mD_t D_t u + \beta|D_t^- u|D_t^- u + s(u) = F]^n. \]
Drop the absolute value for simplicity and find the truncation error of the scheme.
Perform numerical experiments with the scheme and compare with the one based
on centered differences. Can you illustrate the accuracy loss visually in real computations,
or is the asymptotic analysis here mainly of theoretical interest?
Filename: trunc_vib_bw_damping.

C.1 A 1D Wave Equation Simulator
C.1.1 Mathematical Model

Let $u_t$, $u_{tt}$, $u_x$, $u_{xx}$ denote derivatives of $u$ with respect to the subscript, i.e., $u_{tt}$ is
a second-order time derivative and $u_x$ is a first-order space derivative. The initial-boundary
value problem implemented in the wave1D_dn_vc.py code is
\[ u_{tt} = (q(x)u_x)_x + f(x,t), \quad x\in(0,L),\ t\in(0,T] \tag{C.1} \]
\[ u(x,0) = I(x), \quad x\in[0,L] \tag{C.2} \]
\[ u_t(x,0) = V(x), \quad x\in[0,L] \tag{C.3} \]
\[ u(0,t) = U_0(t) \hbox{ or } u_x(0,t) = 0, \quad t\in(0,T] \tag{C.4} \]
\[ u(L,t) = U_L(t) \hbox{ or } u_x(L,t) = 0, \quad t\in(0,T]. \tag{C.5} \]
We allow variable wave velocity $c^2(x) = q(x)$, and Dirichlet or homogeneous
Neumann conditions at the boundaries.

C.1.2 Numerical Discretization

The PDE is discretized by second-order finite differences in time and space, with
arithmetic mean for the variable coefficient
\[ [D_t D_t u = D_x\overline{q}^{x}D_x u + f]_i^n. \tag{C.6} \]
The Neumann boundary conditions are discretized by
\[ [D_{2x}u]_i^n = 0, \]
at a boundary point $i$. The details of how the numerical scheme is worked out are
described in Sect. 2.6 and 2.7.

C.1.3 A Solver Function

The general initial-boundary value problem (C.1)-(C.5) solved by finite difference
methods can be implemented as shown in the following solver function (taken
from the file wave1D_dn_vc.py). This function builds on simpler versions described
in Sect. 2.3, 2.4, 2.6, and 2.7. There are several quite advanced constructs
that will be commented upon later. The code is lengthy, but that is because we provide
a lot of flexibility with respect to input arguments, boundary conditions, and
optimization (scalar versus vectorized loops).


import os
import numpy as np

def solver(
    I, V, f, c, U_0, U_L, L, dt, C, T,
    user_action=None, version='scalar',
    stability_safety_factor=1.0):
    """Solve u_tt=(c^2*u_x)_x + f on (0,L)x(0,T]."""

    # --- Compute time and space mesh ---
    Nt = int(round(T/dt))
    t = np.linspace(0, Nt*dt, Nt+1)      # Mesh points in time

    # Find max(c) using a fake mesh and adapt dx to C and dt
    if isinstance(c, (float,int)):
        c_max = c
    elif callable(c):
        c_max = max([c(x_) for x_ in np.linspace(0, L, 101)])
    dx = dt*c_max/(stability_safety_factor*C)
    Nx = int(round(L/dx))
    x = np.linspace(0, L, Nx+1)          # Mesh points in space
    # Make sure dx and dt are compatible with x and t
    dx = x[1] - x[0]
    dt = t[1] - t[0]

    # Make c(x) available as array
    if isinstance(c, (float,int)):
        c = np.zeros(x.shape) + c
    elif callable(c):
        # Call c(x) and fill array c
        c_ = np.zeros(x.shape)
        for i in range(Nx+1):
            c_[i] = c(x[i])
        c = c_

    q = c**2
    C2 = (dt/dx)**2; dt2 = dt*dt    # Help variables in the scheme

    # --- Wrap user-given f, I, V, U_0, U_L if None or 0 ---
    if f is None or f == 0:
        f = (lambda x, t: 0) if version == 'scalar' else \
            lambda x, t: np.zeros(x.shape)
    if I is None or I == 0:
        I = (lambda x: 0) if version == 'scalar' else \
            lambda x: np.zeros(x.shape)
    if V is None or V == 0:
        V = (lambda x: 0) if version == 'scalar' else \
            lambda x: np.zeros(x.shape)
    if U_0 is not None:
        if isinstance(U_0, (float,int)) and U_0 == 0:
            U_0 = lambda t: 0
    if U_L is not None:
        if isinstance(U_L, (float,int)) and U_L == 0:
            U_L = lambda t: 0

    # --- Make hash of all input data ---
    import hashlib, inspect
    data = inspect.getsource(I) + '_' + inspect.getsource(V) + \
           '_' + inspect.getsource(f) + '_' + str(c) + '_' + \
           ('None' if U_0 is None else inspect.getsource(U_0)) + \
           ('None' if U_L is None else inspect.getsource(U_L)) + \
           '_' + str(L) + str(dt) + '_' + str(C) + '_' + str(T) + \
           '_' + str(stability_safety_factor)
    # sha1 needs bytes in Python 3, hence the encode call
    hashed_input = hashlib.sha1(data.encode('utf-8')).hexdigest()
    if os.path.isfile('.' + hashed_input + '_archive.npz'):
        # Simulation is already run
        return -1, hashed_input

    # --- Allocate memory for solutions ---
    u     = np.zeros(Nx+1)   # Solution array at new time level
    u_n   = np.zeros(Nx+1)   # Solution at 1 time level back
    u_nm1 = np.zeros(Nx+1)   # Solution at 2 time levels back

    # CPU time measurement (time.clock was removed in Python 3.8)
    import time; t0 = time.process_time()

    # --- Valid indices for space and time mesh ---
    Ix = range(0, Nx+1)
    It = range(0, Nt+1)

    # --- Load initial condition into u_n ---
    for i in range(0,Nx+1):
        u_n[i] = I(x[i])

    if user_action is not None:
        user_action(u_n, x, t, 0)

    # --- Special formula for the first step ---
    for i in Ix[1:-1]:
        u[i] = u_n[i] + dt*V(x[i]) + \
        0.5*C2*(0.5*(q[i] + q[i+1])*(u_n[i+1] - u_n[i]) - \
                0.5*(q[i] + q[i-1])*(u_n[i] - u_n[i-1])) + \
        0.5*dt2*f(x[i], t[0])

    i = Ix[0]
    if U_0 is None:
        # Set boundary values (x=0: i-1 -> i+1 since u[i-1]=u[i+1]
        # when du/dn = 0, on x=L: i+1 -> i-1 since u[i+1]=u[i-1])
        ip1 = i+1
        im1 = ip1  # i-1 -> i+1
        u[i] = u_n[i] + dt*V(x[i]) + \
               0.5*C2*(0.5*(q[i] + q[ip1])*(u_n[ip1] - u_n[i]) - \
                       0.5*(q[i] + q[im1])*(u_n[i] - u_n[im1])) + \
        0.5*dt2*f(x[i], t[0])
    else:
        u[i] = U_0(dt)

    i = Ix[-1]
    if U_L is None:
        im1 = i-1
        ip1 = im1  # i+1 -> i-1
        u[i] = u_n[i] + dt*V(x[i]) + \
               0.5*C2*(0.5*(q[i] + q[ip1])*(u_n[ip1] - u_n[i]) - \
                       0.5*(q[i] + q[im1])*(u_n[i] - u_n[im1])) + \
        0.5*dt2*f(x[i], t[0])
    else:
        u[i] = U_L(dt)

    if user_action is not None:
        user_action(u, x, t, 1)

    # Update data structures for next step
    #u_nm1[:] = u_n;  u_n[:] = u  # safe, but slower
    u_nm1, u_n, u = u_n, u, u_nm1

    # --- Time loop ---
    for n in It[1:-1]:
        # Update all inner points
        if version == 'scalar':
            for i in Ix[1:-1]:
                u[i] = - u_nm1[i] + 2*u_n[i] + \
                    C2*(0.5*(q[i] + q[i+1])*(u_n[i+1] - u_n[i]) - \
                        0.5*(q[i] + q[i-1])*(u_n[i] - u_n[i-1])) + \
                dt2*f(x[i], t[n])

        elif version == 'vectorized':
            u[1:-1] = - u_nm1[1:-1] + 2*u_n[1:-1] + \
            C2*(0.5*(q[1:-1] + q[2:])*(u_n[2:] - u_n[1:-1]) -
                0.5*(q[1:-1] + q[:-2])*(u_n[1:-1] - u_n[:-2])) + \
            dt2*f(x[1:-1], t[n])
        else:
            raise ValueError('version=%s' % version)

        # Insert boundary conditions
        i = Ix[0]
        if U_0 is None:
            # Set boundary values
            # x=0: i-1 -> i+1 since u[i-1]=u[i+1] when du/dn=0
            # x=L: i+1 -> i-1 since u[i+1]=u[i-1] when du/dn=0
            ip1 = i+1
            im1 = ip1
            u[i] = - u_nm1[i] + 2*u_n[i] + \
                   C2*(0.5*(q[i] + q[ip1])*(u_n[ip1] - u_n[i]) - \
                       0.5*(q[i] + q[im1])*(u_n[i] - u_n[im1])) + \
            dt2*f(x[i], t[n])
        else:
            u[i] = U_0(t[n+1])

        i = Ix[-1]
        if U_L is None:
            im1 = i-1
            ip1 = im1
            u[i] = - u_nm1[i] + 2*u_n[i] + \
                   C2*(0.5*(q[i] + q[ip1])*(u_n[ip1] - u_n[i]) - \
                       0.5*(q[i] + q[im1])*(u_n[i] - u_n[im1])) + \
            dt2*f(x[i], t[n])
        else:
            u[i] = U_L(t[n+1])

        if user_action is not None:
            if user_action(u, x, t, n+1):
                break

        # Update data structures for next step
        u_nm1, u_n, u = u_n, u, u_nm1

    cpu_time = time.process_time() - t0
    return cpu_time, hashed_input
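The scalar and vectorized branches of the time loop are meant to compute the same update of the inner points. The following stand-alone sketch illustrates that equivalence; step_scalar and step_vectorized are hypothetical helper names, the source term is set to zero for brevity, and the random test data are an assumption made only for this comparison.

```python
import numpy as np

def step_scalar(u_n, u_nm1, q, C2):
    """Update inner points with an explicit loop (f = 0)."""
    u = np.zeros_like(u_n)
    for i in range(1, len(u_n) - 1):
        u[i] = -u_nm1[i] + 2*u_n[i] + C2*(
            0.5*(q[i] + q[i+1])*(u_n[i+1] - u_n[i]) -
            0.5*(q[i] + q[i-1])*(u_n[i] - u_n[i-1]))
    return u

def step_vectorized(u_n, u_nm1, q, C2):
    """Same update expressed with array slices (f = 0)."""
    u = np.zeros_like(u_n)
    u[1:-1] = -u_nm1[1:-1] + 2*u_n[1:-1] + C2*(
        0.5*(q[1:-1] + q[2:])*(u_n[2:] - u_n[1:-1]) -
        0.5*(q[1:-1] + q[:-2])*(u_n[1:-1] - u_n[:-2]))
    return u

rng = np.random.default_rng(0)
u_n = rng.random(11)          # arbitrary data at time level n
u_nm1 = rng.random(11)        # arbitrary data at time level n-1
q = 1 + rng.random(11)        # positive variable coefficient
C2 = 0.64                     # (dt/dx)**2, illustrative value

u_s = step_scalar(u_n, u_nm1, q, C2)
u_v = step_vectorized(u_n, u_nm1, q, C2)
print(np.abs(u_s - u_v).max())
```

The slice `1:-1` plays the role of index `i`, `2:` corresponds to `i+1`, and `:-2` to `i-1`, so the printed difference should be zero or at round-off level.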


C.2 Saving Large Arrays in Files 


Numerical simulations produce large arrays as results and the software needs to
store these arrays on disk. Several methods are available in Python. We recommend
using tailored solutions for large arrays rather than standard file storage tools such as
pickle (cPickle for speed in Python version 2) and shelve, because the tailored
solutions have been optimized for array data and are hence much faster than the
standard tools.


C.2.1 Using savez to Store Arrays in Files 


Storing individual arrays  The numpy.savez function can store a set of arrays
to a named file in a zip archive. An associated function numpy.load can be used
to read the file later. Basically, we call numpy.savez(filename, **kwargs),
where kwargs is a dictionary containing array names as keys and the corresponding
array objects as values. Very often, the solution at a time point is given a natural
name where the name of the variable and the time level counter are combined, e.g.,
u11 or v39. Suppose n is the time level counter and we have two solution arrays, u
and v, that we want to save to a zip archive. The appropriate code is


import numpy as np

u_name = 'u%04d' % n  # array name
v_name = 'v%04d' % n  # array name
kwargs = {u_name: u, v_name: v}  # keyword args for savez
fname = '.mydata%04d.dat' % n
np.savez(fname, **kwargs)
if n == 0:  # store x once
    np.savez('.mydata_x.dat', x=x)


Since the name of the array must be given as a keyword argument to savez, and the
name must be constructed as shown, it becomes a little tricky to do the call, but with
a dictionary kwargs and the **kwargs syntax, which sends each key-value pair as an
individual keyword argument, the task gets accomplished.
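
The keyword-argument trick can be tried in isolation. The following is a minimal sketch, assuming NumPy is installed; the helper name save_time_level, the temporary directory, and the small test arrays are made up for the illustration and are not part of the book's software.

```python
import os
import tempfile
import numpy as np

def save_time_level(directory, n, u, v):
    """Save u and v at time level n under names like u0003 and v0003."""
    kwargs = {'u%04d' % n: u, 'v%04d' % n: v}   # names built at run time
    fname = os.path.join(directory, 'mydata%04d.dat' % n)
    np.savez(fname, **kwargs)                   # numpy appends .npz to fname
    return fname + '.npz'

tmpdir = tempfile.mkdtemp()                     # hypothetical scratch directory
u = np.linspace(0, 1, 5)
v = u**2
path = save_time_level(tmpdir, 3, u, v)

# Read the archive back and look up one array by its constructed name
with np.load(path) as archive:
    names = sorted(archive.files)
    u_read = archive['u0003']
```

Here names holds the two constructed array names, and u_read is a copy of the array that was stored under the name u0003.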


Merging zip archives Each separate call to np.savez creates a new file (zip 
archive) with extension .npz. It is very convenient to collect all results in one 
archive instead. This can be done by merging all the individual .npz files into a 
single zip archive: 


def merge_zip_archives(individual_archives, archive_name):
    """
    Merge individual zip archives made with numpy.savez into
    one archive with name archive_name.
    The individual archives can be given as a list of names
    or as a Unix wildcard filename expression for glob.glob.
    The result of this function is that all the individual
    archives are deleted and the new single archive made.
    """
    import glob, os, zipfile
    archive = zipfile.ZipFile(
        archive_name, 'w', zipfile.ZIP_DEFLATED,
        allowZip64=True)
    if isinstance(individual_archives, (list,tuple)):
        filenames = individual_archives
    elif isinstance(individual_archives, str):
        filenames = glob.glob(individual_archives)

    # Open each archive and write to the common archive
    for filename in filenames:
        f = zipfile.ZipFile(filename, 'r',
                            zipfile.ZIP_DEFLATED)
        for name in f.namelist():
            data = f.open(name, 'r')
            # Save under name without .npy
            archive.writestr(name[:-4], data.read())
        f.close()
        os.remove(filename)
    archive.close()


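Here we remark that savez automatically adds the .npy extension to the names of
the arrays we store (and the .npz extension to the filename). We do not want the
.npy extension in the final archive, which is why the last four characters of each
name are stripped.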
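
The merging step itself only needs the standard zipfile module. The sketch below is a simplified variant under stated assumptions: the function name merge_archives is hypothetical, the archive members are small byte strings standing in for real .npy data, member names are copied unchanged, and the individual files are not deleted.

```python
import os
import tempfile
import zipfile

def merge_archives(filenames, archive_name):
    """Copy every member of the given zip files into one new archive."""
    with zipfile.ZipFile(archive_name, 'w', zipfile.ZIP_DEFLATED) as merged:
        for filename in filenames:
            with zipfile.ZipFile(filename, 'r') as part:
                for name in part.namelist():
                    merged.writestr(name, part.read(name))

# Make three small archives with one member each (stand-in payloads)
tmpdir = tempfile.mkdtemp()
parts = []
for n in range(3):
    part_name = os.path.join(tmpdir, 'part%04d.zip' % n)
    with zipfile.ZipFile(part_name, 'w') as z:
        z.writestr('u%04d.npy' % n, b'payload %d' % n)
    parts.append(part_name)

merged_name = os.path.join(tmpdir, 'merged.zip')
merge_archives(parts, merged_name)
with zipfile.ZipFile(merged_name) as z:
    merged_members = sorted(z.namelist())
```

After the call, merged_members lists one entry per original archive, which is the situation the merged .npz file is meant to create.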


Reading arrays from zip archives Archives created by savez, or the merged
archive we describe above, with a name of the form myarchive.npz, can be
conveniently read by the numpy.load function:


import numpy as np
array_names = np.load('myarchive.npz')
for array_name in array_names:
    # array_names[array_name] is the array itself
    # e.g. plot(array_names['t'], array_names[array_name])
    pass


C.2.2 Using joblib to Store Arrays in Files 


The Python package joblib has nice functionality for efficient storage of arrays on 
disk. The following class applies this functionality so that one can save an array, 
or in fact any Python data structure (e.g., a dictionary of arrays), to disk under a 
certain name. Later, we can retrieve the object by use of its name. The name of the 
directory under which the arrays are stored by joblib can be given by the user. 


class Storage(object):
    """
    Store large data structures (e.g. numpy arrays) efficiently
    using joblib.

    Use:

    >>> from Storage import Storage
    >>> storage = Storage(cachedir='tmp_u01', verbose=1)
    >>> import numpy as np
    >>> a = np.linspace(0, 1, 100000)  # large array
    >>> b = np.linspace(0, 1, 100000)  # large array
    >>> storage.save('a', a)
    >>> storage.save('b', b)
    >>> # later
    >>> a = storage.retrieve('a')
    >>> b = storage.retrieve('b')
    """
    def __init__(self, cachedir='tmp', verbose=1):
        """
        Parameters
        cachedir: str
            Name of directory where objects are stored in files.
        verbose: bool, int
            Let joblib and this class speak when storing files
            to disk.
        """
        import joblib
        self.memory = joblib.Memory(cachedir=cachedir,
                                    verbose=verbose)
        self.verbose = verbose
        self.retrieve = self.memory.cache(
            self.retrieve, ignore=['data'])
        self.save = self.retrieve

    def retrieve(self, name, data=None):
        if self.verbose > 0:
            print 'joblib save of', name
        return data


The retrieve and save functions, which do the work, seem quite magic. The idea
is that joblib looks at the name parameter and saves the return value data to disk
if the name parameter has not been used in a previous call. Otherwise, if name is
already registered, joblib fetches the data object from file and returns it (this is an
example of a memoize function, see Section 2.1.4 in [11] for a brief explanation).
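
The memoize idea can be mimicked without joblib. The following sketch is only an illustration of the mechanism: the class DiskMemo is hypothetical, and it uses pickle purely to stay within the standard library, not because pickle is recommended for large arrays.

```python
import os
import pickle
import tempfile

class DiskMemo(object):
    """Hypothetical stand-in for the joblib-based Storage class."""
    def __init__(self, directory):
        self.directory = directory
        self.writes = 0              # number of times data went to disk

    def retrieve(self, name, data=None):
        path = os.path.join(self.directory, name + '.pkl')
        if os.path.isfile(path):
            # name is already registered: fetch the stored object
            with open(path, 'rb') as f:
                return pickle.load(f)
        # first call with this name: store data and return it
        with open(path, 'wb') as f:
            pickle.dump(data, f)
        self.writes += 1
        return data

    save = retrieve                  # same function under two names

memo = DiskMemo(tempfile.mkdtemp())
memo.save('a', [1.0, 2.0, 3.0])
a = memo.retrieve('a')
```

As in the Storage class, save and retrieve are the same function: the first call with a given name stores the data, and later calls with that name return what was stored.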


C.2.3 Using a Hash to Create a File or Directory Name


Array storage techniques like those outlined in Sect. C.2.1 and C.2.2 require the
user to assign a name to the file(s) or directory where the solution is to be stored.
Ideally, this name should reflect the parameters in the problem such that one can
recognize an already run simulation. One technique is to make a hash string out of
the input data. A hash string, as produced by the SHA1 algorithm used below, is a
40-character long hexadecimal string that, for all practical purposes, uniquely
reflects another potentially much longer string. (You may be used to hash strings
from the Git version control system: every committed version of the files in Git is
recognized by a hash string.)

Suppose you have some input data in the form of functions, numpy arrays, and 
other objects. To turn these input data into a string, we may grab the source code 
of the functions, use a very efficient hash method for potentially large arrays, and 
simply convert all other objects via str to a string representation. The final string, 
merging all input data, is then converted to an SHA1 hash string such that we rep- 
resent the input with a 40-character long string. 


def myfunction(func1, func2, array1, array2, obj1, obj2):
    # Convert arguments to hash
    import inspect, joblib, hashlib
    data = (inspect.getsource(func1),
            inspect.getsource(func2),
            joblib.hash(array1),
            joblib.hash(array2),
            str(obj1),
            str(obj2))
    # sha1 needs one string, so merge the pieces first
    data = '_'.join(data)
    hash_input = hashlib.sha1(data.encode('utf-8')).hexdigest()


It is wise to use joblib.hash and not try to do a str(array1), since that string
can be very long, and joblib.hash is more efficient than hashlib when turning
these data into a hash.
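
The last step, turning one merged string into a 40-character name, can be tested on its own. This is a small sketch with made-up input strings; the helper hash_strings is hypothetical, and the encode call is needed because hashlib works on bytes in Python 3.

```python
import hashlib

def hash_strings(parts):
    """Return the SHA1 hex digest of the given strings joined by '_'."""
    data = '_'.join(parts)
    return hashlib.sha1(data.encode('utf-8')).hexdigest()

# Two input sets that differ only in the last value (say, a time step)
h1 = hash_strings(['lambda x: x', '1.5', '0.01'])
h2 = hash_strings(['lambda x: x', '1.5', '0.02'])
```

Both h1 and h2 are 40 characters long, and they differ although only one input value changed, which is what makes the hash useful as a file or directory name.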


Remark: turning function objects into their source code is unreliable! 
The idea of turning a function object into a string via its source code may look 
smart, but is not a completely reliable solution. Suppose we have some function 


x0 = 0.1
f = lambda x: 0 if x <= x0 else 1


The source code will be f = lambda x: 0 if x <= x0 else 1, so if the
calling code changes the value of x0 (which f remembers, as it is a closure),
the source remains unchanged, the hash is the same, and the change in input
data is unnoticed. Consequently, the technique above must be used with care.
The user can always just remove the stored files on disk and thereby force a
recomputation (provided the software applies a hash to test if a zip archive
or joblib subdirectory exists, and if so, avoids recomputation).
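
The pitfall is easy to reproduce. In the sketch below the source text is written out as a plain string instead of being fetched with inspect.getsource, so that the example runs in any setting; appending the parameter value to the hashed string is one possible remedy and an assumption of this sketch, not something the software described here does.

```python
import hashlib

def sha1_of(text):
    return hashlib.sha1(text.encode('utf-8')).hexdigest()

# The text that getsource would return for f (written out by hand here)
source = 'f = lambda x: 0 if x <= x0 else 1'

x0 = 0.1
f = lambda x: 0 if x <= x0 else 1
value_before = f(0.2)
hash_before = sha1_of(source)

x0 = 0.5                        # the function now behaves differently ...
value_after = f(0.2)
hash_after = sha1_of(source)    # ... but its source text is unchanged

# Possible remedy: hash the parameter values together with the source
safe_before = sha1_of(source + '_x0=' + repr(0.1))
safe_after = sha1_of(source + '_x0=' + repr(0.5))
```

The function value changes when x0 changes while the source-only hash stays the same; the hash that also includes the value of x0 does change.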


C.3 Software for the 1D Wave Equation


We use numpy.savez to store the solution at each time level on disk. Such
actions must be taken care of outside the solver function, more precisely in the
user_action function that is called at every time level.

We have, in the wave1D_dn_vc.py code, implemented the user_action callback
function as a class PlotAndStoreSolution with a __call__(self, u, x, t, n)
method for the user_action function. Basically, __call__ stores and
plots the solution. The storage makes use of the numpy.savez function for
saving a set of arrays to a zip archive. Here, in this callback function, we want
to save one array, u. Since there will be many such arrays, we introduce the array
names 'u%04d' % n and closely related filenames. The usage of numpy.savez in
__call__ goes like this:


from numpy import savez

name = 'u%04d' % n  # array name
kwargs = {name: u}  # keyword args for savez
fname = '.' + self.filename + '_' + name + '.dat'
self.t.append(t[n])  # store corresponding time value
savez(fname, **kwargs)
if n == 0:  # store x once
    savez('.' + self.filename + '_x.dat', x=x)


For example, if n is 10 and self.filename is tmp, the above call to savez
becomes savez('.tmp_u0010.dat', u0010=u). The actual filename becomes
.tmp_u0010.dat.npz. The actual array name becomes u0010.npy.

Each savez call results in a file, so after the simulation we have one file 
per time level. Each file produced by savez is a zip archive. It makes sense 
to merge all the files into one. This is done in the close_file method in the 
PlotAndStoreSolution class. The code goes as follows. 


class PlotAndStoreSolution:

    def close_file(self, hashed_input):
        """
        Merge all files from savez calls into one archive.
        hashed_input is a string reflecting input data
        for this simulation (made by solver).
        """
        if self.filename is not None:
            # Save all the time points where solutions are saved
            savez('.' + self.filename + '_t.dat',
                  t=array(self.t, dtype=float))
            # Merge all savez files to one zip archive
            archive_name = '.' + hashed_input + '_archive.npz'
            filenames = glob.glob('.' + self.filename + '*.dat.npz')
            merge_zip_archives(filenames, archive_name)


We use various ZipFile functionality to extract the content of the individual files
(each with name filename) and write it to the merged archive (archive). There
is only one array in each individual file (filename) so strictly speaking, there is
no need for the loop for name in f.namelist() (as f.namelist() returns a
list of length 1). However, in other applications where we compute more arrays at
each time level, savez will store all these and then there is need for iterating over
f.namelist().

Instead of merging the archives written by savez we could make an alternative 
implementation that writes all our arrays into one archive. This is the subject of 
Exercise C.2. 


C.3.1 Making Hash Strings from Input Data 


The hashed_input argument, used to name the resulting archive file with all
solutions, is supposed to be a hash reflecting all important parameters in the
problem such that this simulation has a unique name. The hashed_input string is
made in the solver function, using the hashlib and inspect modules, based on the
arguments to solver:


# Make hash of all input data
import hashlib, inspect
data = inspect.getsource(I) + '_' + inspect.getsource(V) + \
       '_' + inspect.getsource(f) + '_' + str(c) + '_' + \
       ('None' if U_0 is None else inspect.getsource(U_0)) + \
       ('None' if U_L is None else inspect.getsource(U_L)) + \
       '_' + str(L) + str(dt) + '_' + str(C) + '_' + str(T) + \
       '_' + str(stability_safety_factor)
hashed_input = hashlib.sha1(data).hexdigest()


To get the source code of a function f as a string, we use
inspect.getsource(f). All input, functions as well as variables, is then merged
to a string data, and then hashlib.sha1 makes a unique, much shorter (40 characters
long), fixed-length string out of data that we can use in the archive filename.


Remark 

Note that the construction of the data string is not fool proof: if, e.g., I is a 
formula with parameters and the parameters change, the source code is still the 
same and data and hence the hash remains unaltered. The implementation must 
therefore be used with care! 


C.3.2 Avoiding Rerunning Previously Run Cases 


If the archive file whose name is based on hashed_input already exists, the
simulation with the current set of parameters has been done before and one can avoid
redoing the work. The solver function returns the CPU time and hashed_input,
and a negative CPU time means that no simulation was run. In that case we should
not call the close_file method above (otherwise we overwrite the archive with
just the self.t array). The typical usage goes like


action = PlotAndStoreSolution(...)
dt = (L/Nx)/C  # choose the stability limit with given Nx
cpu, hashed_input = solver(
    I=lambda x: ...,
    V=0, f=0, c=1, U_0=lambda t: 0, U_L=None, L=1,
    dt=dt, C=C, T=T,
    user_action=action, version='vectorized',
    stability_safety_factor=1)
action.make_movie_file()
if cpu > 0:  # did we generate new data?
    action.close_file(hashed_input)
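
The pattern of returning a negative CPU time when the archive already exists can be sketched independently of the wave solver. Everything in the example below is hypothetical: the function run_once, the way the parameters are turned into a string, the .txt archive, and the placeholder value 1.0 that stands in for a measured CPU time.

```python
import hashlib
import os
import tempfile

def run_once(parameters, workdir, simulate):
    """Run simulate(parameters) unless its result file already exists."""
    text = repr(sorted(parameters.items()))
    key = hashlib.sha1(text.encode('utf-8')).hexdigest()
    archive = os.path.join(workdir, '.' + key + '_archive.txt')
    if os.path.isfile(archive):
        return -1, key              # negative value: nothing was run
    result = simulate(parameters)
    with open(archive, 'w') as f:
        f.write(repr(result))
    return 1.0, key                 # placeholder for a measured CPU time

workdir = tempfile.mkdtemp()
prm = {'C': 0.75, 'Nx': 40}
cpu1, key1 = run_once(prm, workdir, lambda p: p['Nx']*2)
cpu2, key2 = run_once(prm, workdir, lambda p: p['Nx']*2)
```

The first call runs the (trivial) simulation and returns a positive value, while the second call with the same parameters finds the archive and returns -1 together with the same hash.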


C.3.3 Verification 


Vanishing approximation error Exact solutions of the numerical equations are
always attractive for verification purposes since the software should reproduce such
solutions to machine precision. With Dirichlet boundary conditions we can
construct a function that is linear in t and quadratic in x that is also an exact
solution of the scheme, while with Neumann conditions we are left with testing just
a constant solution (see comments in Sect. 2.6.5).


Convergence rates A more general method for verification is to check the
convergence rates. We must introduce one discretization parameter $h$ and assume
an error model $E = Ch^r$, where $C$ and $r$ are constants to be determined (i.e.,
$r$ is the rate that we are interested in). Given two experiments with different
resolutions $h_i$ and $h_{i-1}$, we can estimate $r$ by

$$ r = \frac{\ln(E_i/E_{i-1})}{\ln(h_i/h_{i-1})}, $$

where $E_i$ is the error corresponding to $h_i$ and $E_{i-1}$ corresponds to
$h_{i-1}$. Section 2.2.2 explains the details of this type of verification and how
we introduce the single discretization parameter $h = \Delta t = \hat{c}\Delta x$,
for some constant $\hat{c}$. To compute the error, we had to rely on a global
variable in the user action function. Below is an implementation where we have a
more elegant solution in terms of a class: the error variable is now a class
attribute and there is no need for a global error (which is always considered an
advantage).


def convergence_rates(
    u_exact,
    I, V, f, c, U_0, U_L, L,
    dt0, num_meshes,
    C, T, version='scalar',
    stability_safety_factor=1.0):
    """
    Halve the time step and estimate convergence rates for
    num_meshes simulations.
    """
    class ComputeError:
        def __init__(self, norm_type):
            self.error = 0

        def __call__(self, u, x, t, n):
            """Store norm of the error in self.error."""
            error = np.abs(u - u_exact(x, t[n])).max()
            self.error = max(self.error, error)

    E = []
    h = []  # dt, solver adjusts dx such that C=dt*c/dx
    dt = dt0
    for i in range(num_meshes):
        error_calculator = ComputeError('Linf')
        solver(I, V, f, c, U_0, U_L, L, dt, C, T,
               user_action=error_calculator,
               version='scalar',
               stability_safety_factor=1.0)
        E.append(error_calculator.error)
        h.append(dt)
        dt /= 2  # halve the time step for next simulation
    print 'E:', E
    print 'h:', h
    r = [np.log(E[i]/E[i-1])/np.log(h[i]/h[i-1])
         for i in range(1,num_meshes)]
    return r


The returned sequence r should converge to 2 since the error analysis in Sect. 2.10
predicts various error measures to behave like $O(\Delta t^2) + O(\Delta x^2)$. We
can easily run the case with standing waves and the analytical solution
$u(x,t) = \cos(\frac{2\pi}{L}t)\sin(\frac{2\pi}{L}x)$. The call will be very similar
to the one provided in the test_convrate_sincos function in Sect. 2.3.4, see the
file wave1D_dn_vc.py for details.
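
The rate formula can be checked on synthetic data before it is applied to a real solver. The sketch below does not call the wave solver: it manufactures errors that obey the model E = Ch^r exactly with C = 3 and r = 2 (values chosen only for the illustration) and recovers r from them; the function name estimate_rates is hypothetical.

```python
import math

def estimate_rates(E, h):
    """Estimate r in E = C*h**r from consecutive pairs of experiments."""
    return [math.log(E[i]/E[i-1]) / math.log(h[i]/h[i-1])
            for i in range(1, len(E))]

# Synthetic experiment: halve h four times, errors follow 3*h**2 exactly
h = [0.1/2**i for i in range(5)]
E = [3.0*hi**2 for hi in h]
rates = estimate_rates(E, h)
```

Each entry of rates equals 2 up to rounding errors. With errors from a real simulation, the entries are only expected to approach 2 as h is reduced.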


C.4 Programming the Solver with Classes 


Many who know about class programming prefer to organize their software in terms 
of classes. This gives a richer application programming interface (API) since a func- 
tion solver must have all its input data in terms of arguments, while a class-based 
solver naturally has a mix of method arguments and user-supplied methods. (Well, 
to be more precise, our solvers have demanded user_action to be a function pro- 
vided by the user, so it is possible to mix variables and functions in the input also 
with a solver function.) 

We will next illustrate how some of the functionality in wave1D_dn_vc.py may 
be implemented by using classes. Focusing on class implementation aspects, we re- 
strict the example case to a simpler wave with constant wave speed c. Applying the 
method of manufactured solutions, we test whether the class based implementation 
is able to compute the known exact solution within machine precision. 

We will create a class Problem to hold the physical parameters of the problem 
and a class Solver to hold the numerical solution parameters besides the solver 
function itself. As the number of parameters increases, so does the amount of 
repetitive code. We therefore take the opportunity to illustrate how this may be 
counteracted by introducing a super class Parameters that allows code to be
parameterized. In addition, it is convenient to collect the arrays that describe
the mesh in a special Mesh class and make a class Function for a mesh function
(mesh point values and its mesh). All the following code is found in wave1D_oo.py.


C.4.1 Class Parameters 


The classes Problem and Solver both inherit class Parameters, which handles
reading of parameters from the command line and has methods for setting and
getting parameter values. Since processing dictionaries is easier than processing
a collection of individual attributes, the class Parameters requires each class
Problem and Solver to represent their parameters by dictionaries, one compulsory
and two optional ones. The compulsory dictionary, self.prm, contains all
parameters, while a second and optional dictionary, self.type, holds the
associated object types, and a third and optional dictionary, self.help, stores
help strings. The Parameters class may be implemented as follows:


class Parameters(object):
    def __init__(self):
        """
        Subclasses must initialize self.prm with
        parameters and default values, self.type with
        the corresponding types, and self.help with
        the corresponding descriptions of parameters.
        self.type and self.help are optional, but
        self.prm must be complete and contain all parameters.
        """
        pass

    def ok(self):
        """Check if attr. prm, type, and help are defined."""
        if hasattr(self, 'prm') and \
           isinstance(self.prm, dict) and \
           hasattr(self, 'type') and \
           isinstance(self.type, dict) and \
           hasattr(self, 'help') and \
           isinstance(self.help, dict):
            return True
        else:
            raise ValueError(
                'The constructor in class %s does not '\
                'initialize the\ndictionaries '\
                'self.prm, self.type, self.help!' %
                self.__class__.__name__)

    def _illegal_parameter(self, name):
        """Raise exception about illegal parameter name."""
        raise ValueError(
            'parameter "%s" is not registered.\nLegal '\
            'parameters are\n%s' %
            (name, ' '.join(list(self.prm.keys()))))

    def set(self, **parameters):
        """Set one or more parameters."""
        for name in parameters:
            if name in self.prm:
                self.prm[name] = parameters[name]
            else:
                self._illegal_parameter(name)

    def get(self, name):
        """Get one or more parameter values."""
        if isinstance(name, (list,tuple)):  # get many?
            for n in name:
                if n not in self.prm:
                    self._illegal_parameter(n)
            return [self.prm[n] for n in name]
        else:
            if name not in self.prm:
                self._illegal_parameter(name)
            return self.prm[name]

    def __getitem__(self, name):
        """Allow obj[name] indexing to look up a parameter."""
        return self.get(name)

    def __setitem__(self, name, value):
        """
        Allow obj[name] = value syntax to assign a parameter's value.
        """
        # Send the content of name as keyword (not the word 'name')
        return self.set(**{name: value})

    def define_command_line_options(self, parser=None):
        self.ok()
        if parser is None:
            import argparse
            parser = argparse.ArgumentParser()

        for name in self.prm:
            tp = self.type[name] if name in self.type else str
            help = self.help[name] if name in self.help else None
            parser.add_argument(
                '--' + name, default=self.get(name), metavar=name,
                type=tp, help=help)

        return parser

    def init_from_command_line(self, args):
        for name in self.prm:
            self.prm[name] = getattr(args, name)
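
The command-line part of the class boils down to one add_argument call per dictionary entry. The following stand-alone sketch uses plain dictionaries instead of the class; the function make_parser, the dictionary names, and the argument list given to parse_args are assumptions made for the illustration.

```python
import argparse

prm = dict(L=2.5, c=1.5, T=18.0)                 # parameters with defaults
types = dict(L=float, c=float, T=float)          # conversion functions
help_texts = dict(L='1D domain', c='wave velocity', T='end time')

def make_parser(prm, types, help_texts):
    """Create one --name option for each parameter in prm."""
    parser = argparse.ArgumentParser()
    for name in prm:
        parser.add_argument(
            '--' + name, default=prm[name], metavar=name,
            type=types.get(name, str), help=help_texts.get(name))
    return parser

# Pretend the user gave two options on the command line
parser = make_parser(prm, types, help_texts)
args = parser.parse_args(['--c', '2', '--T', '5.5'])
for name in prm:
    prm[name] = getattr(args, name)
```

After the loop, prm holds the converted command-line values for c and T and the untouched default for L.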


C.4.2 Class Problem


Inheriting the Parameters class, our class Problem is defined as: 


class Problem(Parameters):
    """
    Physical parameters for the wave equation
    u_tt = (c**2*u_x)_x + f(x,t) with t in [0,T] and
    x in (0,L). The problem definition is implied by
    the method of manufactured solution, choosing
    u(x,t)=x(L-x)(1+t/2) as our solution. This solution
    should be exactly reproduced when c is const.
    """

    def __init__(self):
        self.prm = dict(L=2.5, c=1.5, T=18)
        self.type = dict(L=float, c=float, T=float)
        self.help = dict(L='1D domain',
                         c='coefficient (wave velocity) in PDE',
                         T='end time of simulation')

    def u_exact(self, x, t):
        L = self['L']
        return x*(L-x)*(1+0.5*t)

    def I(self, x):
        return self.u_exact(x, 0)

    def V(self, x):
        return 0.5*self.u_exact(x, 0)

    def f(self, x, t):
        c = self['c']
        return 2*(1+0.5*t)*c**2

    def U_0(self, t):
        return self.u_exact(0, t)

    U_L = None
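
That the source term f fits the chosen solution can be checked numerically. Since u(x,t)=x(L-x)(1+t/2) is quadratic in x and linear in t, centered second differences reproduce its second derivatives exactly, apart from rounding errors. The sketch below uses plain functions rather than the Problem class; the function residual and the step sizes dx and dt are arbitrary choices for the test.

```python
L, c = 2.5, 1.5      # same default values as in class Problem

def u_exact(x, t):
    return x*(L - x)*(1 + 0.5*t)

def f(x, t):
    return 2*(1 + 0.5*t)*c**2

def residual(x, t, dx=0.1, dt=0.1):
    """Discrete version of u_tt - c**2*u_xx - f at the point (x,t)."""
    u_tt = (u_exact(x, t+dt) - 2*u_exact(x, t) + u_exact(x, t-dt))/dt**2
    u_xx = (u_exact(x+dx, t) - 2*u_exact(x, t) + u_exact(x-dx, t))/dx**2
    return u_tt - c**2*u_xx - f(x, t)

residuals = [abs(residual(x, t))
             for x in (0.5, 1.0, 2.0) for t in (0.0, 1.0, 7.5)]
```

All entries in residuals are at rounding-error level, which confirms that f = 2(1+t/2)c**2 is the source term implied by the manufactured solution when c is constant.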


C.4.3 Class Mesh 


The Mesh class can be made valid for a space-time mesh in any number of space 
dimensions. To make the class versatile, the constructor accepts either a tuple/list of 
number of cells in each spatial dimension or a tuple/list of cell spacings. In addition, 
we need the size of the hypercube mesh as a tuple/list of 2-tuples with lower and 
upper limits of the mesh coordinates in each direction. For 1D meshes it is more 
natural to just write the number of cells or the cell size and not wrap it in a list. We 
also need the time interval from t0 to T. Giving no spatial discretization information
implies a time mesh only, and vice versa. The Mesh class with documentation and 
a doc test should now be self-explanatory: 


import numpy as np

class Mesh(object):
    """
    Holds data structures for a uniform mesh on a hypercube in
    space, plus a uniform mesh in time.

    Argument  Explanation
    L         List of 2-lists of min and max coordinates
              in each spatial direction.
    T         Final time in time mesh.
    Nt        Number of cells in time mesh.
    dt        Time step. Either Nt or dt must be given.
    N         List of number of cells in the spatial directions.
    d         List of cell sizes in the spatial directions.
              Either N or d must be given.

    Users can access all the parameters mentioned above, plus
    ``x[i]`` and ``t`` for the coordinates in direction ``i``
    and the time coordinates, respectively.

    Examples:

    >>> from UniformFDMesh import Mesh
    >>>
    >>> # Simple space mesh
    >>> m = Mesh(L=[0,1], N=4)
    >>> print m.dump()
    space: [0,1] N=4 d=0.25
    >>>
    >>> # Simple time mesh
    >>> m = Mesh(T=4, dt=0.5)
    >>> print m.dump()
    time: [0,4] Nt=8 dt=0.5
    >>>
    >>> # 2D space mesh
    >>> m = Mesh(L=[[0,1], [-1,1]], d=[0.5, 1])
    >>> print m.dump()
    space: [0,1]x[-1,1] N=2x2 d=0.5,1
    >>>
    >>> # 2D space mesh and time mesh
    >>> m = Mesh(L=[[0,1], [-1,1]], d=[0.5, 1], Nt=10, T=3)
    >>> print m.dump()
    space: [0,1]x[-1,1] N=2x2 d=0.5,1 time: [0,3] Nt=10 dt=0.3
    """


    def __init__(self,
                 L=None, T=None, t0=0,
                 N=None, d=None,
                 Nt=None, dt=None):
        if N is None and d is None:
            # No spatial mesh
            if Nt is None and dt is None:
                raise ValueError(
                    'Mesh constructor: either Nt or dt must be given')
            if T is None:
                raise ValueError(
                    'Mesh constructor: T must be given')
        if Nt is None and dt is None:
            if N is None and d is None:
                raise ValueError(
                    'Mesh constructor: either N or d must be given')
            if L is None:
                raise ValueError(
                    'Mesh constructor: L must be given')

        # Allow 1D interface without nested lists with one element
        if L is not None and isinstance(L[0], (float,int)):
            # Only an interval was given
            L = [L]
        if N is not None and isinstance(N, (float,int)):
            N = [N]
        if d is not None and isinstance(d, (float,int)):
            d = [d]

        # Set all attributes to None
        self.x = None
        self.t = None
        self.Nt = None
        self.dt = None
        self.N = None
        self.d = None
        self.t0 = t0

        if N is None and d is not None and L is not None:
            self.L = L
            if len(d) != len(L):
                raise ValueError(
                    'd has different size (no of space dim.) from '
                    'L: %d vs %d' % (len(d), len(L)))
            self.d = d
            self.N = [int(round(float(self.L[i][1] -
                                      self.L[i][0])/d[i]))
                      for i in range(len(d))]
        if d is None and N is not None and L is not None:
            self.L = L
            if len(N) != len(L):
                raise ValueError(
                    'N has different size (no of space dim.) from '
                    'L: %d vs %d' % (len(N), len(L)))
            self.N = N
            self.d = [float(self.L[i][1] - self.L[i][0])/N[i]
                      for i in range(len(N))]

        if Nt is None and dt is not None and T is not None:
            self.T = T
            self.dt = dt
            self.Nt = int(round(T/dt))
        if dt is None and Nt is not None and T is not None:
            self.T = T
            self.Nt = Nt
            self.dt = T/float(Nt)

        if self.N is not None:
            self.x = [np.linspace(
                      self.L[i][0], self.L[i][1], self.N[i]+1)
                      for i in range(len(self.L))]
        # Test self.Nt so the time mesh is made also when dt was given
        if self.Nt is not None:
            self.t = np.linspace(self.t0, self.T, self.Nt+1)

    def get_num_space_dim(self):
        return len(self.d) if self.d is not None else 0

    def has_space(self):
        return self.d is not None

    def has_time(self):
        return self.dt is not None

    def dump(self):
        s = ''
        if self.has_space():
            s += 'space: ' + \
                 'x'.join(['[%g,%g]' % (self.L[i][0], self.L[i][1])
                           for i in range(len(self.L))]) + ' N='
            s += 'x'.join([str(Ni) for Ni in self.N]) + ' d='
            s += ','.join([str(di) for di in self.d])
        if self.has_space() and self.has_time():
            s += ' '
        if self.has_time():
            s += 'time: ' + '[%g,%g]' % (self.t0, self.T) + \
                 ' Nt=%g' % self.Nt + ' dt=%g' % self.dt
        return s
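
The arithmetic at the heart of the constructor, converting between cell sizes and cell counts and generating the coordinates, can be isolated in a few lines. This sketch uses plain lists instead of numpy arrays and hypothetical function names; it is not a replacement for the Mesh class.

```python
def cells_from_spacing(L, d):
    """Number of cells in each direction, given the cell sizes d."""
    return [int(round(float(b - a)/di)) for (a, b), di in zip(L, d)]

def spacing_from_cells(L, N):
    """Cell size in each direction, given the numbers of cells N."""
    return [float(b - a)/Ni for (a, b), Ni in zip(L, N)]

def coordinates(L, N):
    """Mesh point coordinates in each direction (N[i]+1 points)."""
    return [[a + i*float(b - a)/Ni for i in range(Ni + 1)]
            for (a, b), Ni in zip(L, N)]

# The 2D space mesh from the doc string: [0,1]x[-1,1] with d=0.5,1
L = [[0, 1], [-1, 1]]
N = cells_from_spacing(L, d=[0.5, 1])
d = spacing_from_cells(L, N)
x = coordinates(L, N)
```

For this mesh the result is two cells in each direction, that is, three mesh points along each axis, in agreement with the output N=2x2 d=0.5,1 in the doc string.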


We rely on attribute access, not get/set functions!
Java programmers, in particular, are used to get/set functions in classes to access
internal data. In Python, we usually apply direct access of the attribute, such as
m.N[i] if m is a Mesh object. A widely used convention is to do this as long as
access to an attribute does not require additional code. When additional code is
required, one applies a property construction. The original interface remains the
same after a property is introduced (in contrast to Java), so users will not notice
a change to properties.
The only argument against direct attribute access in class Mesh is that the
attributes are meant to be read-only, and with get/set functions we could simply
avoid offering a set function. Instead, we rely on the user not to assign new
values to the attributes.
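
A property gives the same attribute syntax while running code behind the scenes, and it can also make an attribute read-only. The class TimeMesh below is a hypothetical, stripped-down example and not part of the Mesh class.

```python
class TimeMesh(object):
    """Hypothetical time mesh where dt is computed and read-only."""
    def __init__(self, T, Nt):
        self.T = T
        self.Nt = Nt

    @property
    def dt(self):
        # Computed on access; no setter is defined for this property
        return self.T/float(self.Nt)

m = TimeMesh(T=3, Nt=10)
step = m.dt                    # looks like plain attribute access

try:
    m.dt = 0.1                 # assignment is rejected
    assignment_allowed = True
except AttributeError:
    assignment_allowed = False
```

User code reads m.dt exactly as before the property was introduced, while an attempt to assign to m.dt raises an AttributeError.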


C.4.4 Class Function 


A class Function is handy to hold a mesh and corresponding values for a scalar 
or vector function over the mesh. Since we may have a time or space mesh, or a 
combined time and space mesh, with one or more components in the function, some 
if tests are needed for allocating the right array sizes. To help the user, an indices 
attribute with the name of the indices in the final array u for the function values is 
made. The examples in the doc string should explain the functionality. 


class Function(object):
    """
    A scalar or vector function over a mesh (of class Mesh).

    mesh        Class Mesh object: spatial and/or temporal mesh.
    num_comp    Number of components in function (1 for scalar).
    space_only  True if the function is defined on the space mesh
                only (to save space). False if function has values
                in space and time.

    The indexing of ``u``, which holds the mesh point values of the
    function, depends on whether we have a space and/or time mesh.

    Examples:

    >>> from UniformFDMesh import Mesh, Function
    >>>
    >>> # Simple space mesh
    >>> m = Mesh(L=[0,1], N=4)
    >>> print m.dump()
    space: [0,1] N=4 d=0.25
    >>> f = Function(m)
    >>> f.indices
    ['x0']
    >>> f.u.shape
    (5,)
    >>> f.u[4]  # space point 4
    0.0
    >>>
    >>> # Simple time mesh for two components
    >>> m = Mesh(T=4, dt=0.5)
    >>> print m.dump()
    time: [0,4] Nt=8 dt=0.5
    >>> f = Function(m, num_comp=2)
    >>> f.indices
    ['time', 'component']
    >>> f.u.shape
    (9, 2)
    >>> f.u[3,1]  # time point 3, comp=1 (2nd comp.)
    0.0
    >>>
    >>> # 2D space mesh
    >>> m = Mesh(L=[[0,1], [-1,1]], d=[0.5, 1])
    >>> print m.dump()
    space: [0,1]x[-1,1] N=2x2 d=0.5,1
    >>> f = Function(m)
    >>> f.indices
    ['x0', 'x1']
    >>> f.u.shape
    (3, 3)
    >>> f.u[1,2]  # space point (1,2)
    0.0
    >>>
    >>> # 2D space mesh and time mesh
    >>> m = Mesh(L=[[0,1], [-1,1]], d=[0.5,1], Nt=10, T=3)
    >>> print m.dump()
    space: [0,1]x[-1,1] N=2x2 d=0.5,1 time: [0,3] Nt=10 dt=0.3
    >>> f = Function(m, num_comp=2, space_only=False)
    >>> f.indices
    ['time', 'x0', 'x1', 'component']
    >>> f.u.shape
    (11, 3, 3, 2)
    >>> f.u[2,1,2,0]  # time step 2, space point (1,2), comp=0
    0.0
    >>> # Function with space data only
    >>> f = Function(m, num_comp=1, space_only=True)
    >>> f.indices
    ['x0', 'x1']
    >>> f.u.shape
    (3, 3)
    >>> f.u[1,2]  # space point (1,2)
    0.0
    """


    def __init__(self, mesh, num_comp=1, space_only=True):
        self.mesh = mesh
        self.num_comp = num_comp
        self.indices = []

        # Create array(s) to store mesh point values
        if (self.mesh.has_space() and not self.mesh.has_time()) or \
           (self.mesh.has_space() and self.mesh.has_time() and \
            space_only):
            # Space mesh only
            if num_comp == 1:
                self.u = np.zeros(
                    [self.mesh.N[i] + 1
                     for i in range(len(self.mesh.N))])
                self.indices = [
                    'x'+str(i) for i in range(len(self.mesh.N))]
            else:
                self.u = np.zeros(
                    [self.mesh.N[i] + 1
                     for i in range(len(self.mesh.N))] +
                    [num_comp])
                self.indices = [
                    'x'+str(i)
                    for i in range(len(self.mesh.N))] +\
                    ['component']
        if not self.mesh.has_space() and self.mesh.has_time():
            # Time mesh only
            if num_comp == 1:
                self.u = np.zeros(self.mesh.Nt+1)
                self.indices = ['time']
            else:
                # Need num_comp entries per time step
                self.u = np.zeros((self.mesh.Nt+1, num_comp))
                self.indices = ['time', 'component']
        if self.mesh.has_space() and self.mesh.has_time() \
           and not space_only:
            # Space-time mesh
            size = [self.mesh.Nt+1] + \
                   [self.mesh.N[i]+1
                    for i in range(len(self.mesh.N))]
            if num_comp > 1:
                self.indices = ['time'] + \
                    ['x'+str(i)
                     for i in range(len(self.mesh.N))] +\
                    ['component']
                size += [num_comp]
            else:
                self.indices = ['time'] + ['x'+str(i)
                    for i in range(len(self.mesh.N))]
            self.u = np.zeros(size)
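
The if tests in the constructor can be summarized as three rules: a time axis comes first unless only space data is stored, then one axis per space direction, and finally a component axis if there is more than one component. The function array_layout below is a hypothetical condensation of this logic that returns the shape and the index names without allocating any array.

```python
def array_layout(N=None, Nt=None, num_comp=1, space_only=True):
    """Return (shape, index names) for the array of mesh point values."""
    has_space = N is not None
    has_time = Nt is not None
    shape, indices = [], []
    if has_time and not (has_space and space_only):
        shape.append(Nt + 1)
        indices.append('time')
    if has_space:
        for i, Ni in enumerate(N):
            shape.append(Ni + 1)
            indices.append('x%d' % i)
    if num_comp > 1:
        shape.append(num_comp)
        indices.append('component')
    return tuple(shape), indices

# The four cases from the doc string of class Function
case_space = array_layout(N=[4])
case_time = array_layout(Nt=8, num_comp=2)
case_full = array_layout(N=[2, 2], Nt=10, num_comp=2, space_only=False)
case_space_only = array_layout(N=[2, 2], Nt=10, num_comp=1, space_only=True)
```

The four results can be compared with the f.u.shape and f.indices values listed in the doc string of class Function.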


C.4.5 Class Solver 


With the Mesh and Function classes in place, we can rewrite the solver function, 
but we make it a method in class Solver: 


class Solver(Parameters):
    """
    Numerical parameters for solving the wave equation
    u_tt = (c**2*u_x)_x + f(x,t) with t in [0,T] and
    x in (0,L). The problem definition is implied by
    the method of manufactured solution, choosing
    u(x,t)=x(L-x)(1+t/2) as our solution. This solution
    should be exactly reproduced, provided c is const.
    We simulate in [0, L/2] and apply a symmetry condition
    at the end x=L/2.
    """

    def __init__(self, problem):
        self.problem = problem
        self.prm = dict(C = 0.75, Nx=3, stability_safety_factor=1.0)
        self.type = dict(C=float, Nx=int, stability_safety_factor=float)
        self.help = dict(C='Courant number',
                         Nx='No of spatial mesh points',
                         stability_safety_factor='stability factor')

        from UniformFDMesh import Mesh, Function
        # introduce some local help variables to ease reading
        L_end = self.problem['L']
        dx = (L_end/2)/float(self['Nx'])
        t_interval = self.problem['T']
        dt = dx*self['stability_safety_factor']*self['C']/ \
             float(self.problem['c'])
        self.m = Mesh(L=[0,L_end/2],
                      d=[dx],
                      Nt = int(round(t_interval/float(dt))),
                      T=t_interval)
        # The mesh function f will, after solving, contain
        # the solution for the whole domain and all time steps.
        self.f = Function(self.m, num_comp=1, space_only=False)

    def solve(self, user_action=None, version='scalar'):
        # ...use local variables to ease reading
        L, c, T = self.problem['L c T'.split()]
        L = L/2  # compute with half the domain only (symmetry)
        C, Nx, stability_safety_factor = self[
            'C Nx stability_safety_factor'.split()]
        dx = self.m.d[0]
        I = self.problem.I
        V = self.problem.V
        f = self.problem.f
        U_0 = self.problem.U_0
        U_L = self.problem.U_L
        Nt = self.m.Nt
        t = np.linspace(0, T, Nt+1)  # Mesh points in time
        x = np.linspace(0, L, Nx+1)  # Mesh points in space

        # Make sure dx and dt are compatible with x and t
        dx = x[1] - x[0]
        dt = t[1] - t[0]

        # Treat c(x) as array
        if isinstance(c, (float,int)):
            c = np.zeros(x.shape) + c
        elif callable(c):
            # Call c(x) and fill array c
            c_ = np.zeros(x.shape)
            for i in range(Nx+1):
                c_[i] = c(x[i])
            c = c_

        q = c**2
        C2 = (dt/dx)**2; dt2 = dt*dt  # Help variables in the scheme

        # Wrap user-given f, I, V, U_0, U_L if None or 0
        if f is None or f == 0:
            f = (lambda x, t: 0) if version == 'scalar' else \
                lambda x, t: np.zeros(x.shape)
        if I is None or I == 0:
            I = (lambda x: 0) if version == 'scalar' else \
                lambda x: np.zeros(x.shape)
        if V is None or V == 0:
            V = (lambda x: 0) if version == 'scalar' else \
                lambda x: np.zeros(x.shape)
        if U_0 is not None:
            if isinstance(U_0, (float,int)) and U_0 == 0:
                U_0 = lambda t: 0
        if U_L is not None:
            if isinstance(U_L, (float,int)) and U_L == 0:
                U_L = lambda t: 0


        # Make hash of all input data
        import hashlib, inspect
        data = inspect.getsource(I) + '_' + inspect.getsource(V) + \
               '_' + inspect.getsource(f) + '_' + str(c) + '_' + \
               ('None' if U_0 is None else inspect.getsource(U_0)) + \
               ('None' if U_L is None else inspect.getsource(U_L)) + \
               '_' + str(L) + str(dt) + '_' + str(C) + '_' + str(T) + \
               '_' + str(stability_safety_factor)
        hashed_input = hashlib.sha1(data).hexdigest()
        if os.path.isfile('.' + hashed_input + '_archive.npz'):
            # Simulation is already run
            return -1, hashed_input

        # use local variables to make code closer to mathematical
        # notation in computational scheme
        u_1 = self.f.u[0,:]
        u = self.f.u[1,:]

        import time; t0 = time.clock()  # CPU time measurement

        Ix = range(0, Nx+1)
        It = range(0, Nt+1)

        # Load initial condition into u_1
        for i in range(0,Nx+1):
            u_1[i] = I(x[i])

        if user_action is not None:
            user_action(u_1, x, t, 0)


        # Special formula for the first step
        for i in Ix[1:-1]:
            u[i] = u_1[i] + dt*V(x[i]) + \
            0.5*C2*(0.5*(q[i] + q[i+1])*(u_1[i+1] - u_1[i]) - \
                    0.5*(q[i] + q[i-1])*(u_1[i] - u_1[i-1])) + \
            0.5*dt2*f(x[i], t[0])

        i = Ix[0]
        if U_0 is None:
            # Set boundary values (x=0: i-1 -> i+1 since u[i-1]=u[i+1]
            # when du/dn = 0, on x=L: i+1 -> i-1 since u[i+1]=u[i-1])
            ip1 = i+1
            im1 = ip1  # i-1 -> i+1
            u[i] = u_1[i] + dt*V(x[i]) + \
            0.5*C2*(0.5*(q[i] + q[ip1])*(u_1[ip1] - u_1[i]) - \
                    0.5*(q[i] + q[im1])*(u_1[i] - u_1[im1])) + \
            0.5*dt2*f(x[i], t[0])
        else:
            u[i] = U_0(dt)

        i = Ix[-1]
        if U_L is None:
            im1 = i-1
            ip1 = im1  # i+1 -> i-1
            u[i] = u_1[i] + dt*V(x[i]) + \
            0.5*C2*(0.5*(q[i] + q[ip1])*(u_1[ip1] - u_1[i]) - \
                    0.5*(q[i] + q[im1])*(u_1[i] - u_1[im1])) + \
            0.5*dt2*f(x[i], t[0])
        else:
            u[i] = U_L(dt)

        if user_action is not None:
            user_action(u, x, t, 1)

        for n in It[1:-1]:
            # u corresponds to u^{n+1} in the mathematical scheme
            u_2 = self.f.u[n-1,:]
            u_1 = self.f.u[n,:]
            u = self.f.u[n+1,:]

            # Update all inner points
            if version == 'scalar':
                for i in Ix[1:-1]:
                    u[i] = - u_2[i] + 2*u_1[i] + \
                    C2*(0.5*(q[i] + q[i+1])*(u_1[i+1] - u_1[i]) - \
                        0.5*(q[i] + q[i-1])*(u_1[i] - u_1[i-1])) + \
                    dt2*f(x[i], t[n])

            elif version == 'vectorized':
                u[1:-1] = - u_2[1:-1] + 2*u_1[1:-1] + \
                C2*(0.5*(q[1:-1] + q[2:])*(u_1[2:] - u_1[1:-1]) -
                    0.5*(q[1:-1] + q[:-2])*(u_1[1:-1] - u_1[:-2])) + \
                dt2*f(x[1:-1], t[n])
            else:
                raise ValueError('version=%s' % version)

            # Insert boundary conditions
            i = Ix[0]
            if U_0 is None:
                # Set boundary values
                # x=0: i-1 -> i+1 since u[i-1]=u[i+1] when du/dn=0
                # x=L: i+1 -> i-1 since u[i+1]=u[i-1] when du/dn=0
                ip1 = i+1
                im1 = ip1
                u[i] = - u_2[i] + 2*u_1[i] + \
                C2*(0.5*(q[i] + q[ip1])*(u_1[ip1] - u_1[i]) - \
                    0.5*(q[i] + q[im1])*(u_1[i] - u_1[im1])) + \
                dt2*f(x[i], t[n])
            else:
                u[i] = U_0(t[n+1])

            i = Ix[-1]
            if U_L is None:
                im1 = i-1
                ip1 = im1
                u[i] = - u_2[i] + 2*u_1[i] + \
                C2*(0.5*(q[i] + q[ip1])*(u_1[ip1] - u_1[i]) - \
                    0.5*(q[i] + q[im1])*(u_1[i] - u_1[im1])) + \
                dt2*f(x[i], t[n])
            else:
                u[i] = U_L(t[n+1])

            if user_action is not None:
                if user_action(u, x, t, n+1):
                    break

        cpu_time = time.clock() - t0
        return cpu_time, hashed_input

    def assert_no_error(self):
        """Run through mesh and check error"""
        Nx = self['Nx']
        Nt = self.m.Nt
        L, T = self.problem['L T'.split()]
        L = L/2  # only half the domain used (symmetry)
        x = np.linspace(0, L, Nx+1)  # Mesh points in space
        t = np.linspace(0, T, Nt+1)  # Mesh points in time

        for n in range(len(t)):
            u_e = self.problem.u_exact(x, t[n])
            diff = np.abs(self.f.u[n,:] - u_e).max()
            print 'diff:', diff
            tol = 1E-13
            assert diff < tol


Observe that the solutions from all time steps are stored in the mesh function,
which allows error assessment (in assert_no_error) to take place after all
solutions have been found. Of course, in 2D or 3D, such a strategy may place too
high demands on available computer memory, in which case intermediate results
could be stored on file.
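
To get a feeling for the memory demands mentioned above, the following minimal
sketch estimates the storage needed to keep every time level of a scalar mesh
function with 64-bit floats. The helper name storage_bytes and the mesh sizes
are our own illustrative assumptions, not part of wave1D_oo.py.

```python
def storage_bytes(num_time_levels, points_per_dim, bytes_per_value=8):
    """Bytes needed to keep every time level of a scalar mesh function."""
    points = 1
    for n in points_per_dim:
        points *= n
    return num_time_levels * points * bytes_per_value

# Hypothetical mesh sizes: 1001 points in time and in each space direction
mem_1d = storage_bytes(1001, [1001])              # about 8 MB
mem_2d = storage_bytes(1001, [1001, 1001])        # about 8 GB
mem_3d = storage_bytes(1001, [1001, 1001, 1001])  # about 8 TB
```

With these (assumed) sizes, the 1D case fits easily in memory, while the 2D and
3D cases quickly become impractical without writing intermediate results to
file.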

Running wave1D_oo.py gives a printout showing that the class-based imple- 
mentation performs as expected, i.e. that the known exact solution is reproduced 
(within machine precision). 
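
The verification idea can be illustrated without the class machinery. The
sketch below is a simplified stand-in, not the code in wave1D_oo.py: it assumes
constant c, works on the whole domain [0,L] with u=0 at both ends (instead of
the symmetry condition at x=L/2), and uses plain Python lists. The function name
solve_wave_1d and all parameter values are our own assumptions. Since the
manufactured solution u(x,t)=x(L-x)(1+t/2) is quadratic in x and linear in t,
the scheme should reproduce it to within rounding errors.

```python
def solve_wave_1d(I, V, f, c, L, Nx, C, T):
    """Solve u_tt = c**2*u_xx + f on (0,L) with u=0 at x=0 and x=L."""
    dx = L / float(Nx)
    dt = C * dx / c
    Nt = int(round(T / dt))
    x = [i * dx for i in range(Nx + 1)]
    t = [n * dt for n in range(Nt + 1)]
    C2 = (c * dt / dx) ** 2
    dt2 = dt * dt

    u_nm1 = [I(xi) for xi in x]          # initial condition
    levels = [u_nm1]

    # Special formula for the first step; boundary values stay 0
    u_n = [0.0] * (Nx + 1)
    for i in range(1, Nx):
        u_n[i] = (u_nm1[i] + dt * V(x[i])
                  + 0.5 * C2 * (u_nm1[i - 1] - 2 * u_nm1[i] + u_nm1[i + 1])
                  + 0.5 * dt2 * f(x[i], t[0]))
    levels.append(u_n)

    # General formula for the remaining time levels
    for n in range(1, Nt):
        u = [0.0] * (Nx + 1)
        for i in range(1, Nx):
            u[i] = (2 * u_n[i] - u_nm1[i]
                    + C2 * (u_n[i - 1] - 2 * u_n[i] + u_n[i + 1])
                    + dt2 * f(x[i], t[n]))
        levels.append(u)
        u_nm1, u_n = u_n, u
    return x, t, levels

# Manufactured solution and the data it implies (parameter values assumed)
L, c = 2.5, 1.5
u_exact = lambda x, t: x * (L - x) * (1 + 0.5 * t)
I = lambda x: u_exact(x, 0)
V = lambda x: 0.5 * x * (L - x)
f = lambda x, t: 2 * c**2 * (1 + 0.5 * t)

x, t, levels = solve_wave_1d(I, V, f, c, L, Nx=6, C=0.75, T=2.0)
max_error = max(abs(levels[n][i] - u_exact(x[i], t[n]))
                for n in range(len(t)) for i in range(len(x)))
```

As in assert_no_error, all time levels are kept so that the error can be
measured after the simulation has finished.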


C.5 Migrating Loops to Cython 


We now consider the wave2D_u0.py code for solving the 2D linear wave equa- 
tion with constant wave velocity and homogeneous Dirichlet boundary conditions 
u = 0. We shall in the present chapter extend this code with computational 
modules written in other languages than Python. This extended version is called 
wave2D_u0_adv.py. 

The wave2D_u0.py file contains a solver function, which calls an advance_*
function to advance the numerical scheme one level forward in time. The
function advance_scalar applies standard Python loops to implement the scheme,
while advance_vectorized performs the corresponding vectorized arithmetic with
array slices. The statements of this solver are explained in Sect. 2.12, in
particular Sects. 2.12.1 and 2.12.2.
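
The relation between the two implementation styles can be sketched as follows.
This is not the code from wave2D_u0.py: the functions advance_loops and
advance_slices are simplified illustrations that take a precomputed source
array f_a and omit the special first step, and the random test data are
arbitrary.

```python
import numpy as np

def advance_loops(u, u_n, u_nm1, f_a, Cx2, Cy2, dt2):
    """Scalar style: explicit loops over the interior mesh points."""
    Nx, Ny = u.shape[0] - 1, u.shape[1] - 1
    for i in range(1, Nx):
        for j in range(1, Ny):
            u_xx = u_n[i-1, j] - 2*u_n[i, j] + u_n[i+1, j]
            u_yy = u_n[i, j-1] - 2*u_n[i, j] + u_n[i, j+1]
            u[i, j] = (2*u_n[i, j] - u_nm1[i, j]
                       + Cx2*u_xx + Cy2*u_yy + dt2*f_a[i, j])
    # Boundary condition u=0
    u[0, :] = 0; u[Nx, :] = 0; u[:, 0] = 0; u[:, Ny] = 0
    return u

def advance_slices(u, u_n, u_nm1, f_a, Cx2, Cy2, dt2):
    """Vectorized style: the same update expressed with array slices."""
    u_xx = u_n[:-2, 1:-1] - 2*u_n[1:-1, 1:-1] + u_n[2:, 1:-1]
    u_yy = u_n[1:-1, :-2] - 2*u_n[1:-1, 1:-1] + u_n[1:-1, 2:]
    u[1:-1, 1:-1] = (2*u_n[1:-1, 1:-1] - u_nm1[1:-1, 1:-1]
                     + Cx2*u_xx + Cy2*u_yy + dt2*f_a[1:-1, 1:-1])
    # Boundary condition u=0
    u[0, :] = 0; u[-1, :] = 0; u[:, 0] = 0; u[:, -1] = 0
    return u

# Arbitrary test data on a small mesh
rng = np.random.RandomState(0)
u_n, u_nm1, f_a = rng.rand(5, 6), rng.rand(5, 6), rng.rand(5, 6)
u_loops = advance_loops(np.zeros((5, 6)), u_n, u_nm1, f_a, 0.3, 0.2, 0.01)
u_slices = advance_slices(np.zeros((5, 6)), u_n, u_nm1, f_a, 0.3, 0.2, 0.01)
```

Comparing u_loops and u_slices (e.g., with np.allclose) is a cheap test that
should be repeated for every new implementation of the loops.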

Although vectorization can bring down the CPU time dramatically compared
with scalar code, there is still a factor of 5-10 to gain in these types of
applications by implementing the finite difference scheme in compiled code,
typically in Fortran, C, or C++. This can quite easily be done by adding a
little extra code to our program. Cython is an extension of Python that offers
the easiest way to nail our Python loops in the scalar code down to machine
code and achieve the efficiency of C.

Cython can be viewed as an extended Python language where variables are
declared with types and where functions are marked to be implemented in C.
Migrating Python code to Cython is done by copying the desired code segments to
functions (or classes) and placing them in one or more separate files with
extension .pyx.


C.5.1 Declaring Variables and Annotating the Code 


Our starting point is the plain advance_scalar function for a scalar
implementation of the updating algorithm for new values u^{n+1}_{i,j}:


def advance_scalar(u, u_n, u_nm1, f, x, y, t, n, Cx2, Cy2, dt2,
                   V=None, step1=False):
    Ix = range(0, u.shape[0]); Iy = range(0, u.shape[1])
    if step1:
        dt = sqrt(dt2)  # save
        Cx2 = 0.5*Cx2; Cy2 = 0.5*Cy2; dt2 = 0.5*dt2  # redefine
        D1 = 1; D2 = 0
    else:
        D1 = 2; D2 = 1
    for i in Ix[1:-1]:
        for j in Iy[1:-1]:
            u_xx = u_n[i-1,j] - 2*u_n[i,j] + u_n[i+1,j]
            u_yy = u_n[i,j-1] - 2*u_n[i,j] + u_n[i,j+1]
            u[i,j] = D1*u_n[i,j] - D2*u_nm1[i,j] + \
                     Cx2*u_xx + Cy2*u_yy + dt2*f(x[i], y[j], t[n])
            if step1:
                u[i,j] += dt*V(x[i], y[j])
    # Boundary condition u=0
    j = Iy[0]
    for i in Ix: u[i,j] = 0
    j = Iy[-1]
    for i in Ix: u[i,j] = 0
    i = Ix[0]
    for j in Iy: u[i,j] = 0
    i = Ix[-1]
    for j in Iy: u[i,j] = 0
    return u

We simply take a copy of this function and put it in a file
wave2D_u0_loop_cy.pyx. The relevant Cython implementation arises from declaring
variables with types and adding some important annotations to speed up array
computing in Cython. Let us first list the complete code in the .pyx file:


import numpy as np
cimport numpy as np
cimport cython
ctypedef np.float64_t DT    # data type

@cython.boundscheck(False)  # turn off array bounds check
@cython.wraparound(False)   # turn off negative indices (u[-1,-1])
cpdef advance(
    np.ndarray[DT, ndim=2, mode='c'] u,
    np.ndarray[DT, ndim=2, mode='c'] u_n,
    np.ndarray[DT, ndim=2, mode='c'] u_nm1,
    np.ndarray[DT, ndim=2, mode='c'] f,
    double Cx2, double Cy2, double dt2):

    cdef:
        int Ix_start = 0
        int Iy_start = 0
        int Ix_end = u.shape[0]-1
        int Iy_end = u.shape[1]-1
        int i, j
        double u_xx, u_yy

    for i in range(Ix_start+1, Ix_end):
        for j in range(Iy_start+1, Iy_end):
            u_xx = u_n[i-1,j] - 2*u_n[i,j] + u_n[i+1,j]
            u_yy = u_n[i,j-1] - 2*u_n[i,j] + u_n[i,j+1]
            u[i,j] = 2*u_n[i,j] - u_nm1[i,j] + \
                     Cx2*u_xx + Cy2*u_yy + dt2*f[i,j]
    # Boundary condition u=0
    j = Iy_start
    for i in range(Ix_start, Ix_end+1): u[i,j] = 0
    j = Iy_end
    for i in range(Ix_start, Ix_end+1): u[i,j] = 0
    i = Ix_start
    for j in range(Iy_start, Iy_end+1): u[i,j] = 0
    i = Ix_end
    for j in range(Iy_start, Iy_end+1): u[i,j] = 0
    return u


This example may act as a recipe on how to transform array-intensive code with 
loops into Cython. 


1. Variables are declared with types: for example, double v in the argument
   list instead of just v, and cdef double v for a variable v in the body of
   the function. A Python float object is declared as double for translation
   to C by Cython, while an int object is declared by int.

2. Arrays need a comprehensive type declaration involving
   • the type np.ndarray,
   • the data type of the elements, here 64-bit floats, abbreviated as DT
     through ctypedef np.float64_t DT (instead of DT we could use the full
     name of the data type: np.float64_t, which is a Cython-defined type),
   • the number of dimensions of the array, here ndim=2 (a one-dimensional
     array would use ndim=1),
   • specification of contiguous memory for the array (mode='c').

3. Functions declared with cpdef are translated to C but are also accessible
   from Python.

4. In addition to the standard numpy import we also need a special Cython
   import of numpy: cimport numpy as np, to appear after the standard import.

5. By default, array indices are checked to be within their legal limits. To
   speed up the code one should turn off this feature for a specific function
   by placing @cython.boundscheck(False) above the function header.

6. Also by default, array indices can be negative (counting from the end), but
   this feature has a performance penalty and is therefore here turned off by
   writing @cython.wraparound(False) right above the function header.

7. The use of index sets Ix and Iy in the scalar code cannot be successfully
   translated to C. One reason is that constructions like Ix[1:-1] involve
   negative indices, and these are now turned off. Another reason is that
   Cython loops must take the form for i in xrange or for i in range to be
   translated into efficient C loops. We have therefore introduced Ix_start as
   Ix[0] and Ix_end as Ix[-1] to hold the start and end of the values of index
   i. Similar variables are introduced for the j index. A loop for i in Ix is
   with these new variables written as for i in range(Ix_start, Ix_end+1).
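
The rewrite of index sets in point 7 can be checked in plain Python before any
Cython code is compiled. The small sketch below (with an arbitrarily chosen
Nx) confirms that the start/end variables generate the same indices as the
index set Ix and its slice Ix[1:-1].

```python
Nx = 6                               # arbitrary mesh size for illustration
Ix = range(0, Nx + 1)                # index set used in the scalar code

Ix_start = Ix[0]
Ix_end = Ix[-1]

# Inner points: Ix[1:-1] versus explicit loop bounds
inner_from_slice = list(Ix[1:-1])
inner_from_bounds = list(range(Ix_start + 1, Ix_end))

# All points: Ix versus explicit loop bounds
all_from_set = list(Ix)
all_from_bounds = list(range(Ix_start, Ix_end + 1))
```

Both pairs of lists coincide, so replacing the index sets by range calls with
explicit bounds does not change which mesh points are visited.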


Array declaration syntax in Cython 

We have used the syntax np.ndarray[DT, ndim=2, mode='c'] to declare
numpy arrays in Cython. There is a simpler, alternative syntax, employing typed
memory views¹, where the declaration looks like double[:,:]. However, the
full support for this functionality is not yet ready, and in this text we use
the full array declaration syntax.


C.5.2 Visual Inspection of the C Translation 


Cython can visually explain how successfully it translated a code from Python to C. 
The command 


Terminal 


Terminal> cython -a wave2D_u0_loop_cy.pyx 


produces an HTML file wave2D_u0_loop_cy.html, which can be loaded into a
web browser to show which lines of the code have been translated to C.
Figure C.1 shows the annotated code. Yellow lines indicate the lines that
Cython did not manage to translate to efficient C code and that remain in
Python. For the present code we see that Cython is able to translate all the
loops with array computing to C, which is our primary goal.


Fig. C.1 Visual illustration of Cython's ability to translate Python to C


¹ http://docs.cython.org/src/userguide/memoryviews.html

You can also inspect the generated C code directly, as it appears in the file 
wave2D_u0_loop_cy.c. Nevertheless, understanding this C code requires some 
familiarity with writing Python extension modules in C by hand. Deep down in the 
file we can see in detail how the compute-intensive statements have been translated 
into some complex C code that is quite different from what a human would write 
(at least if a direct correspondence to the mathematical notation was intended). 


C.5.3 Building the Extension Module 


Cython code must be translated to C, compiled, and linked to form what is known 
in the Python world as a C extension module. This is usually done by making a 
setup.py script, which is the standard way of building and installing Python soft- 
ware. For an extension module arising from Cython code, the following setup.py 
script is all we need to build and install the module: 


from distutils.core import setup
from distutils.extension import Extension
from Cython.Distutils import build_ext

cymodule = 'wave2D_u0_loop_cy'
setup(
    name=cymodule,
    ext_modules=[Extension(cymodule, [cymodule + '.pyx'],)],
    cmdclass={'build_ext': build_ext},
)


We run the script by 


Terminal 


Terminal> python setup.py build_ext --inplace 


The --inplace option makes the extension module available in the current
directory as the file wave2D_u0_loop_cy.so. This file acts as a normal Python
module that can be imported and inspected:


>>> import wave2D_u0_loop_cy
>>> dir(wave2D_u0_loop_cy)
['__builtins__', '__doc__', '__file__', '__name__',
 '__package__', '__test__', 'advance', 'np']


The important output from the dir function is our Cython function advance (the
module also features the imported numpy module under the name np as well as
many standard Python objects with double underscores in their names).

The setup.py file makes use of the distutils package in Python and Cython's
extension of this package. These tools know how Python was built on the
computer and will use compatible compiler(s) and options when building other
code in Cython, C, or C++. Quite some experience with building large program
systems is needed to do the build process manually, so using a setup.py script
is strongly recommended.


Simplified build of a Cython module 
When there is no need to link the C code with special libraries, Cython offers a 
shortcut for generating and importing the extension module: 


import pyximport; pyximport.install() 


This makes the setup.py script redundant. However, in the wave2D_u0_adv.py
code we do not use pyximport and require an explicit build process of this and
many other modules.


C.5.4 Calling the Cython Function from Python 


The wave2D_u0_loop_cy module contains our advance function, which we now 
may call from the Python program for the wave equation: 


import wave2D_u0_loop_cy
advance = wave2D_u0_loop_cy.advance

for n in It[1:-1]:                  # time loop
    f_a[:,:] = f(xv, yv, t[n])      # precompute, size as u
    u = advance(u, u_n, u_nm1, f_a, Cx2, Cy2, dt2)


Efficiency For a mesh consisting of 120 x 120 cells, the scalar Python code
requires 1370 CPU time units, the vectorized version requires 5.5, while the
Cython version requires only 1! For a smaller mesh with 60 x 60 cells Cython is
about 1000 times faster than the scalar Python code, and the vectorized version
is about 6 times slower than the Cython version.
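
The CPU time units above are timings scaled such that the fastest
implementation gets the value 1. A minimal sketch of how such relative numbers
can be produced with the standard timeit module is shown below. The helper
relative_cpu_times and the two toy workloads are our own assumptions; in a real
comparison the callables would run the scalar, vectorized, and compiled solvers.

```python
import timeit

def relative_cpu_times(candidates, repeat=3, number=5):
    """Return best-of-repeat timings, scaled so that the fastest equals 1."""
    best = {}
    for name, func in candidates.items():
        best[name] = min(timeit.repeat(func, repeat=repeat, number=number))
    fastest = min(best.values())
    return dict((name, value / fastest) for name, value in best.items())

# Hypothetical workloads standing in for slow and fast implementations
def slow_sum():
    total = 0
    for k in range(20000):
        total += k*k
    return total

def fast_sum():
    return sum(k*k for k in range(200))

units = relative_cpu_times({'slow': slow_sum, 'fast': fast_sum})
```

Taking the minimum over several repetitions reduces the influence of other
processes on the machine; the actual numbers will of course vary between
computers.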


C.6 Migrating Loops to Fortran 


Instead of relying on Cython's (excellent) ability to translate Python to C, we
can invoke a compiled language directly and write the loops ourselves. Let us
start with Fortran 77, because this language has more convenient array handling
than C (or plain C++): we can use the same multi-dimensional indices in the
Fortran code as in the numpy arrays in the Python code, while in C these arrays
are one-dimensional and require us to reduce multi-dimensional indices to a
single index.


C.6.1 The Fortran Subroutine


We write a Fortran subroutine advance in a file wave2D_u0_loop_f77.f for im- 
plementing the updating formula (2.117) and setting the solution to zero at the 
boundaries: 


      subroutine advance(u, u_n, u_nm1, f, Cx2, Cy2, dt2, Nx, Ny)
      integer Nx, Ny
      real*8 u(0:Nx,0:Ny), u_n(0:Nx,0:Ny), u_nm1(0:Nx,0:Ny)
      real*8 f(0:Nx,0:Ny), Cx2, Cy2, dt2
      integer i, j
      real*8 u_xx, u_yy
Cf2py intent(in, out) u

C     Scheme at interior points
      do j = 1, Ny-1
         do i = 1, Nx-1
            u_xx = u_n(i-1,j) - 2*u_n(i,j) + u_n(i+1,j)
            u_yy = u_n(i,j-1) - 2*u_n(i,j) + u_n(i,j+1)
            u(i,j) = 2*u_n(i,j) - u_nm1(i,j) + Cx2*u_xx + Cy2*u_yy +
     &               dt2*f(i,j)
         end do
      end do

C     Boundary conditions
      j = 0
      do i = 0, Nx
         u(i,j) = 0
      end do
      j = Ny
      do i = 0, Nx
         u(i,j) = 0
      end do
      i = 0
      do j = 0, Ny
         u(i,j) = 0
      end do
      i = Nx
      do j = 0, Ny
         u(i,j) = 0
      end do
      return
      end


This code is plain Fortran 77, except for the special Cf2py comment line, which
here specifies that u is both an input argument and an object to be returned
from the advance routine. More precisely, Fortran is not able to return an
array from a function, but we need a wrapper code in C for the Fortran
subroutine to enable calling it from Python, and from this wrapper code one can
return u to the calling Python code.


Tip: Return all computed objects to the calling code 
It is not strictly necessary to return u to the calling Python code since the
advance function will modify the elements of u, but the convention in Python
is to get all output from a function as returned values. That is, the right way
of calling the above Fortran subroutine from Python is


u = advance(u, u_n, u_nm1, f, Cx2, Cy2, dt2)


The less encouraged style, which works and resembles the way the Fortran
subroutine is called from Fortran, reads


advance(u, u_n, u_nm1, f, Cx2, Cy2, dt2)
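
The two calling styles can be mimicked in pure Python. In the sketch below,
advance_demo is a hypothetical stand-in for the wrapped routine, and plain
lists stand in for arrays; it only illustrates that a function may both modify
its argument in place and return it.

```python
def advance_demo(u, u_n):
    """Hypothetical stand-in for advance: fills u in place and returns it."""
    for i in range(len(u)):
        u[i] = 2.0 * u_n[i]
    return u

u_n = [1.0, 2.0, 3.0]

# Recommended style: bind the name u to the returned object
u = [0.0, 0.0, 0.0]
u_returned = advance_demo(u, u_n)

# Less encouraged style: rely on the in-place modification only
w = [0.0, 0.0, 0.0]
advance_demo(w, u_n)
```

Here both styles give the same values, and u_returned is the very same object
as u. With a wrapped Fortran routine, our understanding is that the second
style gives the new values only when the wrapper did not have to copy the
input array (see Sect. C.6.3), which is one more reason to prefer the first
style.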


C.6.2 Building the Fortran Module with f2py 


The nice feature of writing loops in Fortran is that, without much effort, the tool 
f2py can produce a C extension module such that we can call the Fortran version 
of advance from Python. The necessary commands to run are 


Terminal 


Terminal> f2py -m wave2D_u0_loop_f77 -h wave2D_u0_loop_f77.pyf \ 
--overwrite-signature wave2D_u0_loop_f77.f 

Terminal> f2py -c wave2D_u0_loop_f77.pyf --build-dir build_f77 \ 
-DF2PY_REPORT_ON_ARRAY_COPY=1 wave2D_u0_loop_f77.f 


The first command asks f2py to interpret the Fortran code and make a Fortran 90 
specification of the extension module in the file wave2D_u0_loop_f77.pyf. The 
second command makes f2py generate all necessary wrapper code, compile our 
Fortran file and the wrapper code, and finally build the module. The build process 
takes place in the specified subdirectory build_f77 so that files can be inspected 
if something goes wrong. The option -DF2PY_REPORT_ON_ARRAY_COPY=1 makes 
f2py write a message for every array that is copied in the communication between 
Fortran and Python, which is very useful for avoiding unnecessary array copying 
(see below). The name of the module file is wave2D_u0_loop_f77.so, and this 
file can be imported and inspected as any other Python module: 


>>> import wave2D_u0_loop_f77 
>>> dir(wave2D_u0_loop_f77) 
['__doc__', '__file__', '__name__', '__package__',
 '__version__', 'advance']
>>> print wave2D_u0_loop_f77.__doc__
This module 'wave2D_u0_loop_f77' is auto-generated with f2py....
Functions:
  u = advance(u,u_n,u_nm1,f,cx2,cy2,dt2,
      nx=(shape(u,0)-1),ny=(shape(u,1)-1))


Examine the doc strings! 

Printing the doc strings of the module and its functions is extremely important
after having created a module with f2py. The reason is that f2py makes Python
interfaces to the Fortran functions that are different from how the functions
are declared in the Fortran code (!). The rationale for this behavior is that
f2py creates Pythonic interfaces such that Fortran routines can be called in
the same way as one calls Python functions. Output data from Python functions
is always returned to the calling code, but this is technically impossible in
Fortran. Also, arrays in Python are passed to Python functions without their
dimensions because that information is packed with the array data in the array
objects. This is not possible in Fortran, however. Therefore, f2py removes
array dimensions from the argument list, and f2py makes it possible to return
objects back to Python.


Let us follow the advice of examining the doc strings and take a close look at the 
documentation f2py has generated for our Fortran advance subroutine: 


>>> print wave2D_u0_loop_f77.advance.__doc__ 
This module 'wave2D_u0_loop_f77' is auto-generated with f2py
Functions:
  u = advance(u,u_n,u_nm1,f,cx2,cy2,dt2,
      nx=(shape(u,0)-1),ny=(shape(u,1)-1))

advance - Function signature:
  u = advance(u,u_n,u_nm1,f,cx2,cy2,dt2,[nx,ny])
Required arguments:
  u : input rank-2 array('d') with bounds (nx + 1,ny + 1)
  u_n : input rank-2 array('d') with bounds (nx + 1,ny + 1)
  u_nm1 : input rank-2 array('d') with bounds (nx + 1,ny + 1)
  f : input rank-2 array('d') with bounds (nx + 1,ny + 1)
  cx2 : input float
  cy2 : input float
  dt2 : input float
Optional arguments:
  nx := (shape(u,0)-1) input int
  ny := (shape(u,1)-1) input int
Return objects:
  u : rank-2 array('d') with bounds (nx + 1,ny + 1)


Here we see that the nx and ny parameters declared in Fortran are optional argu- 
ments that can be omitted when calling advance from Python. 

We strongly recommend printing the documentation of every Fortran function
to be called from Python and making sure the call syntax is exactly as listed
in the documentation.
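
The meaning of the optional arguments can be illustrated by a pure Python
mimic of the generated interface. The function advance_signature below is
hypothetical and only reproduces how nx and ny get their default values from
the shape of u; nested lists stand in for arrays.

```python
def advance_signature(u, nx=None, ny=None):
    """Hypothetical mimic of the optional nx, ny arguments."""
    if nx is None:
        nx = len(u) - 1          # corresponds to shape(u,0)-1
    if ny is None:
        ny = len(u[0]) - 1       # corresponds to shape(u,1)-1
    return nx, ny

u = [[0.0] * 4 for _ in range(3)]       # nested list with "shape" (3, 4)
defaults = advance_signature(u)
explicit = advance_signature(u, nx=2, ny=3)
```

Omitting nx and ny gives the same values as supplying them explicitly, which
is why the Python call normally leaves them out.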


C.6.3 How to Avoid Array Copying 


Multi-dimensional arrays are stored as a stream of numbers in memory. For a
two-dimensional array consisting of rows and columns there are two ways of
creating such a stream: row-major ordering, which means that rows are stored
consecutively in memory, or column-major ordering, which means that the columns
are stored one after each other. Languages in the C family, and numpy arrays in
Python by default, apply the row-major ordering, but Fortran uses column-major
storage. Thinking of a two-dimensional array in Python or C as a matrix, it
means that Fortran works with the transposed matrix.

Fortunately, f2py creates extra code so that accessing u(i,j) in the Fortran
subroutine corresponds to the element u[i,j] in the underlying numpy array
(without the extra code, u(i,j) in Fortran would access u[j,i] in the numpy
array). Technically, f2py takes a copy of our numpy array and reorders the data
before sending the array to Fortran. Such copying can be costly. For 2D wave
simulations on a 60 x 60 grid the overhead of copying is a factor of 5, which
means that almost the whole performance gain of Fortran over vectorized numpy
code is lost!
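
The two storage orders described above can be made concrete with a small pure
Python sketch. The function names and the 2 x 3 example matrix are our own
choices for illustration.

```python
def row_major_offset(i, j, n_rows, n_cols):
    """Position of element (i,j) in the stream with C (row-major) storage."""
    return i * n_cols + j

def column_major_offset(i, j, n_rows, n_cols):
    """Position of element (i,j) with Fortran (column-major) storage."""
    return i + j * n_rows

def flatten(a, offset):
    """Lay out the nested list a as a one-dimensional stream."""
    n_rows, n_cols = len(a), len(a[0])
    stream = [None] * (n_rows * n_cols)
    for i in range(n_rows):
        for j in range(n_cols):
            stream[offset(i, j, n_rows, n_cols)] = a[i][j]
    return stream

a = [[1, 2, 3],
     [4, 5, 6]]
a_transposed = [[a[i][j] for i in range(len(a))] for j in range(len(a[0]))]

c_stream = flatten(a, row_major_offset)
fortran_stream = flatten(a, column_major_offset)
transposed_c_stream = flatten(a_transposed, row_major_offset)
```

The column-major stream of a equals the row-major stream of the transposed
matrix, which is the precise meaning of the statement that Fortran works with
the transposed matrix.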

To prevent f2py from copying arrays with C storage to the corresponding Fortran
storage, we declare the arrays with Fortran storage:


order = 'Fortran' if version == 'f77' else 'C'
u     = zeros((Nx+1,Ny+1), order=order)   # solution array
u_n   = zeros((Nx+1,Ny+1), order=order)   # solution at t-dt
u_nm1 = zeros((Nx+1,Ny+1), order=order)   # solution at t-2*dt


In the compile and build step of using f2py, it is recommended to add an extra 
option for making f2py report on array copying: 


Terminal 


Terminal> f2py -c wave2D_u0_loop_f77.pyf --build-dir build_f77 \ 
-DF2PY_REPORT_ON_ARRAY_COPY=1 wave2D_u0_loop_f77.f 


It can sometimes be a challenge to track down which array causes the copying.
There are two principal reasons for copying array data: either the array does
not have Fortran storage or the element types do not match those declared in
the Fortran code. The latter cause is usually effectively eliminated by using
real*8 data in the Fortran code and float64 (the default float type in numpy)
in the arrays on the Python side. The former reason is more common, and to
check whether an array before a Fortran call has the right storage one can
print the result of isfortran(a), which is True if the array a has Fortran
storage.

Let us look at an example where we face problems with array storage. A typical
problem in the wave2D_u0.py code is to set


f_a = f(xv, yv, t[n])


before the call to the Fortran advance routine. This computation creates a new
array with C storage. An undesired copy of f_a will be produced when sending
f_a to a Fortran routine. There are two remedies, either direct insertion of
data in an array with Fortran storage,


f_a = zeros((Nx+1, Ny+1), order='Fortran')
f_a[:,:] = f(xv, yv, t[n])


or remaking the f(xv, yv, t[n]) array,


f_a = asarray(f(xv, yv, t[n]), order='Fortran')


The former remedy is most efficient if the asarray operation is to be performed
a large number of times.
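
Both remedies, and the isfortran check, can be tried out in a few lines. In
this sketch the array f_c is an arbitrary array with C storage that stands in
for the result of f(xv, yv, t[n]); the array sizes are chosen for illustration
only.

```python
import numpy as np

# Stand-in for f(xv, yv, t[n]): an array with (default) C storage
f_c = np.arange(20, dtype=float).reshape(4, 5)
c_storage = np.isfortran(f_c)            # False: C storage

# Remedy 1: insert the data into a preallocated array with Fortran storage
f_a = np.zeros((4, 5), order='F')
f_a[:, :] = f_c                          # storage order of f_a is kept
remedy1_storage = np.isfortran(f_a)

# Remedy 2: remake the array with Fortran storage (creates a new array)
f_b = np.asarray(f_c, order='F')
remedy2_storage = np.isfortran(f_b)
```

Both f_a and f_b hold the same values as f_c but with Fortran storage, so they
can be sent to the Fortran routine without triggering a copy for storage
reasons.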


Efficiency The efficiency of this Fortran code is very similar to the Cython code. 
There is usually nothing more to gain, from a computational efficiency point of 
view, by implementing the complete Python program in Fortran or C. That will just 
be a lot more code for all administering work that is needed in scientific software, 
especially if we extend our sample program wave2D_u0.py to handle a real scien- 
tific problem. Then only a small portion will consist of loops with intensive array 
calculations. These can be migrated to Cython or Fortran as explained, while the 
rest of the programming can be more conveniently done in Python. 


C.7 Migrating Loops to C via Cython 


The computationally intensive loops can alternatively be implemented in C code.
Just as Fortran calls for care regarding the storage of two-dimensional arrays,
working with two-dimensional arrays in C is a bit tricky. The reason is that
numpy arrays are viewed as one-dimensional arrays when transferred to C, while
C programmers will think of u, u_n, and u_nm1 as two-dimensional arrays and
index them like u[i][j]. The C code must declare u as double* u and translate
an index pair [i][j] to a corresponding single index when u is viewed as
one-dimensional. This translation requires knowledge of how the numbers in u
are stored in memory.


C.7.1 Translating Index Pairs to Single Indices 


Two-dimensional numpy arrays with the default C storage are stored row by row.
In general, multi-dimensional arrays with C storage are stored such that the
last index has the fastest variation, then the next last index, and so on,
ending up with the slowest variation in the first index. For a two-dimensional
u declared as zeros((Nx+1,Ny+1)) in Python, the individual elements are stored
in the following order:


u[0,0], u[0,1], u[0,2], ..., u[0,Ny], u[1,0], u[1,1], u[1,2], ...,
u[1,Ny], u[2,0], ..., u[Nx,0], u[Nx,1], ..., u[Nx,Ny]


Viewing u as one-dimensional, the index pair (i,j) translates to i(N_y + 1) + j.
So, where a C programmer would naturally write an index u[i][j], the indexing
must read u[i*(Ny+1) + j]. This is tedious to write, so it can be handy to
define a C macro,


#define idx(i,j) (i)*(Ny+1) + j


so that we can write u[idx(i,j)], which reads much better and is easier to
debug.


Be careful with macro definitions

Macros just perform simple text substitutions: idx(hello,world) is expanded
to (hello)*(Ny+1) + world. The parentheses in (i) are essential: using the
natural mathematical formula i*(Ny+1) + j in the macro definition,
idx(i-1,j) would expand to i-1*(Ny+1) + j, which is the wrong formula.
Macros are handy, but require careful use. In C++, inline functions are safer
and replace the need for macros.
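
The effect of the missing parentheses can be demonstrated by imitating the
text substitution in Python, where the arithmetic operators involved have the
same precedence as in C. The helper expand and the numerical values of i, j,
and Ny are our own choices for illustration.

```python
def expand(macro_body, i_text, j_text):
    """Imitate the C preprocessor: paste the argument texts into the body."""
    return macro_body.format(i=i_text, j=j_text)

safe_body = "({i})*(Ny+1) + {j}"      # parentheses around the first argument
unsafe_body = "{i}*(Ny+1) + {j}"      # the natural mathematical formula

safe_text = expand(safe_body, "i-1", "j")
unsafe_text = expand(unsafe_body, "i-1", "j")

values = {"i": 2, "j": 3, "Ny": 4}
safe_index = eval(safe_text, {}, dict(values))      # (2-1)*5 + 3
unsafe_index = eval(unsafe_text, {}, dict(values))  # 2 - 1*5 + 3
```

The safe macro gives the intended index 8 for the element (1,3), while the
unsafe variant gives 0 and would silently address the wrong array element.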


C.7.2 The Complete C Code 


The C version of our function advance can be coded as follows. 


#define idx(i,j) (i)*(Ny+1) + j

void advance(double* u, double* u_n, double* u_nm1, double* f,
             double Cx2, double Cy2, double dt2, int Nx, int Ny)
{
  int i, j;
  double u_xx, u_yy;
  /* Scheme at interior points */
  for (i=1; i<=Nx-1; i++) {
    for (j=1; j<=Ny-1; j++) {
      u_xx = u_n[idx(i-1,j)] - 2*u_n[idx(i,j)] + u_n[idx(i+1,j)];
      u_yy = u_n[idx(i,j-1)] - 2*u_n[idx(i,j)] + u_n[idx(i,j+1)];
      u[idx(i,j)] = 2*u_n[idx(i,j)] - u_nm1[idx(i,j)] +
        Cx2*u_xx + Cy2*u_yy + dt2*f[idx(i,j)];
    }
  }
  /* Boundary conditions */
  j = 0;  for (i=0; i<=Nx; i++) u[idx(i,j)] = 0;
  j = Ny; for (i=0; i<=Nx; i++) u[idx(i,j)] = 0;
  i = 0;  for (j=0; j<=Ny; j++) u[idx(i,j)] = 0;
  i = Nx; for (j=0; j<=Ny; j++) u[idx(i,j)] = 0;
}


C.7.3 The Cython Interface File 


All the code above appears in the file wave2D_u0_loop_c.c. We need to compile
this file together with C wrapper code such that advance can be called from
Python. Cython can be used to generate appropriate wrapper code. The relevant
Cython code for interfacing C is placed in a file with extension .pyx. This
file, called wave2D_u0_loop_c_cy.pyx², looks like


import numpy as np
cimport numpy as np
cimport cython

cdef extern from "wave2D_u0_loop_c.h":
    void advance(double* u, double* u_n, double* u_nm1, double* f,
                 double Cx2, double Cy2, double dt2,
                 int Nx, int Ny)

@cython.boundscheck(False)
@cython.wraparound(False)
def advance_cwrap(
    np.ndarray[double, ndim=2, mode='c'] u,
    np.ndarray[double, ndim=2, mode='c'] u_n,
    np.ndarray[double, ndim=2, mode='c'] u_nm1,
    np.ndarray[double, ndim=2, mode='c'] f,
    double Cx2, double Cy2, double dt2):
    advance(&u[0,0], &u_n[0,0], &u_nm1[0,0], &f[0,0],
            Cx2, Cy2, dt2,
            u.shape[0]-1, u.shape[1]-1)
    return u


² http://tinyurl.com/nu656p2/softeng2/wave2D_u0_loop_c_cy.pyx


We first declare the C functions to be interfaced. These must also appear in a C 
header file, wave2D_u0_loop_c.h, 


extern void advance(double* u, double* u_n, double* u_nm1, double* f, 
                    double Cx2, double Cy2, double dt2, 
                    int Nx, int Ny); 


The next step is to write a Cython function with Python objects as arguments. The 
name advance is already used for the C function so the function to be called from 
Python is named advance_cwrap. The body of this function is simply a call 
to the advance version in C. To this end, the right information from the Python 
objects must be passed on as arguments to advance. Arrays are sent with their C 
pointers to the first element, obtained in Cython as &u[0, 0] (the & takes the address 
of a C variable). The Nx and Ny arguments in advance are easily obtained from the 
shape of the numpy array u. Finally, u must be returned such that we can set u = 
advance(...) in Python. 
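Passing &u[0,0] to C only makes sense if the array data are stored contiguously in row-major order, which is what mode='c' in the argument declarations requires. The following sketch shows how one might check this on the Python side and convert an array if needed, and how Nx and Ny follow from the array shape. The helper prepare_for_c is hypothetical and not part of the book's files:

```python
import numpy as np

def prepare_for_c(a):
    """Return a float64, C-contiguous array with the contents of a.

    No copy is made if a already fulfills these requirements.
    """
    return np.ascontiguousarray(a, dtype=np.float64)

Nx, Ny = 4, 6
u = np.zeros((Nx + 1, Ny + 1))          # C-contiguous by default
u_T = np.zeros((Ny + 1, Nx + 1)).T      # transposed view: not C-contiguous

print(u.flags['C_CONTIGUOUS'], u_T.flags['C_CONTIGUOUS'])

u_ok = prepare_for_c(u)                 # shares memory with u
u_T_ok = prepare_for_c(u_T)             # new array with C-order storage

# Mesh sizes as they are computed inside advance_cwrap
Nx_from_shape, Ny_from_shape = u_ok.shape[0] - 1, u_ok.shape[1] - 1
```

Arrays created with np.zeros, as in the wave equation solver, already have the required layout, so such a conversion is normally needed only for transposed arrays or slices.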


C.7.4 Building the Extension Module 
It remains to build the extension module. An appropriate setup.py file is 


from distutils.core import setup 
from distutils.extension import Extension 
from Cython.Distutils import build_ext 

sources = ['wave2D_u0_loop_c.c', 'wave2D_u0_loop_c_cy.pyx'] 
module = 'wave2D_u0_loop_c_cy' 
setup( 
    name=module, 
    ext_modules=[Extension(module, sources, 
                           libraries=[],  # C libs to link with 
                           )], 
    cmdclass={'build_ext': build_ext}, 
) 


All we need to specify is the .c file(s) and the .pyx interface file. Cython is 
automatically run to generate the necessary wrapper code. Files are then compiled 
and linked to an extension module residing in the file wave2D_u0_loop_c_cy.so. 
Here is a session with running setup.py and examining the resulting module in 
Python: 


Terminal 


Terminal> python setup.py build_ext --inplace 
Terminal> python 
>>> import wave2D_u0_loop_c_cy as m 


>>> dir(m) 
['__builtins__', '__doc__', '__file__', '__name__', '__package__', 
 '__test__', 'advance_cwrap', 'np'] 


The call to the C version of advance can go like this in Python: 


import wave2D_u0_loop_c_cy 
advance = wave2D_u0_loop_c_cy.advance_cwrap 


f_a[:,:] = f(xv, yv, t[n]) 
u = advance(u, u_n, u_nm1, f_a, Cx2, Cy2, dt2) 
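A compiled advance function should be verified against a reference implementation of the same scheme. The sketch below is one possible way to do this. The functions advance_loops and advance_numpy are illustrative helpers (not files from the book), the test data are arbitrary, and the try/except falls back to the NumPy version if the extension module has not been built, so that the comparison can still be run:

```python
import numpy as np

def advance_loops(u, u_n, u_nm1, f_a, Cx2, Cy2, dt2):
    """Plain Python loops mirroring the C function (slow reference)."""
    Nx, Ny = u.shape[0] - 1, u.shape[1] - 1
    for i in range(1, Nx):
        for j in range(1, Ny):
            u_xx = u_n[i-1,j] - 2*u_n[i,j] + u_n[i+1,j]
            u_yy = u_n[i,j-1] - 2*u_n[i,j] + u_n[i,j+1]
            u[i,j] = 2*u_n[i,j] - u_nm1[i,j] + \
                     Cx2*u_xx + Cy2*u_yy + dt2*f_a[i,j]
    u[:,0] = 0; u[:,Ny] = 0; u[0,:] = 0; u[Nx,:] = 0
    return u

def advance_numpy(u, u_n, u_nm1, f_a, Cx2, Cy2, dt2):
    """Vectorized version with the same call syntax as advance_cwrap."""
    u_xx = u_n[:-2,1:-1] - 2*u_n[1:-1,1:-1] + u_n[2:,1:-1]
    u_yy = u_n[1:-1,:-2] - 2*u_n[1:-1,1:-1] + u_n[1:-1,2:]
    u[1:-1,1:-1] = 2*u_n[1:-1,1:-1] - u_nm1[1:-1,1:-1] + \
                   Cx2*u_xx + Cy2*u_yy + dt2*f_a[1:-1,1:-1]
    u[:,0] = 0; u[:,-1] = 0; u[0,:] = 0; u[-1,:] = 0
    return u

try:
    from wave2D_u0_loop_c_cy import advance_cwrap as advance
except ImportError:
    advance = advance_numpy   # stand-in when the extension is not built

# Arbitrary (random) data on a small mesh
rng = np.random.RandomState(1)
Nx, Ny = 5, 7
u_n, u_nm1, f_a = (rng.uniform(size=(Nx+1, Ny+1)) for _ in range(3))
Cx2, Cy2, dt2 = 0.25, 0.16, 0.01

u_ref = advance_loops(np.zeros((Nx+1, Ny+1)), u_n, u_nm1, f_a,
                      Cx2, Cy2, dt2)
u_new = advance(np.zeros((Nx+1, Ny+1)), u_n, u_nm1, f_a,
                Cx2, Cy2, dt2)
max_diff = np.abs(u_ref - u_new).max()
assert max_diff < 1E-13
```

The tolerance 1E-13 allows for rounding errors, since the arithmetic operations may be carried out in a different order in the different implementations.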


Efficiency In this example, the C and Fortran code run at the same speed, and 
there are no significant differences in the efficiency of the wrapper code. The 
overhead implied by the wrapper code is negligible unless there is very little 
numerical work in the advance function, or in other words, unless we work with 
very small meshes. 
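How much a fixed cost per function call matters can be explored by timing calls for a range of mesh sizes. The following sketch uses the timeit module with a vectorized NumPy function as a stand-in for the compiled advance (an assumption made so that the code runs without building the extension). The helper time_per_point is hypothetical, and the measured numbers will vary between machines:

```python
import timeit
import numpy as np

def advance_numpy(u, u_n, u_nm1, f_a, Cx2, Cy2, dt2):
    """Vectorized stand-in with the call syntax of advance_cwrap."""
    u_xx = u_n[:-2,1:-1] - 2*u_n[1:-1,1:-1] + u_n[2:,1:-1]
    u_yy = u_n[1:-1,:-2] - 2*u_n[1:-1,1:-1] + u_n[1:-1,2:]
    u[1:-1,1:-1] = 2*u_n[1:-1,1:-1] - u_nm1[1:-1,1:-1] + \
                   Cx2*u_xx + Cy2*u_yy + dt2*f_a[1:-1,1:-1]
    u[:,0] = 0; u[:,-1] = 0; u[0,:] = 0; u[-1,:] = 0
    return u

def time_per_point(N, repetitions=20):
    """Return the best measured time per mesh point for Nx=Ny=N."""
    u, u_n, u_nm1, f_a = (np.zeros((N+1, N+1)) for _ in range(4))
    timer = timeit.Timer(
        lambda: advance_numpy(u, u_n, u_nm1, f_a, 0.25, 0.25, 0.01))
    # Best of three runs, each with `repetitions` calls
    best = min(timer.repeat(repeat=3, number=repetitions))/repetitions
    return best/(N + 1)**2

for N in 4, 40, 400:
    print('N=%4d: %.2e s per mesh point' % (N, time_per_point(N)))
```

Replacing advance_numpy by advance_cwrap in time_per_point gives the corresponding measurements for the C version.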


C.8 Migrating Loops to C via f2py 


An alternative to using Cython for interfacing C code is to apply f2py. The C 
code is the same, just the details of specifying how it is to be called from Python 
differ. The f2py tool requires the call specification to be a Fortran 90 module 
defined in a .pyf file. This file was automatically generated when we interfaced a 
Fortran subroutine. With a C function we need to write this module ourselves, or 
we can use a trick and let f2py generate it for us. The trick consists in writing the 
signature of the C function with Fortran syntax and placing it in a Fortran file, here 
wave2D_u0_loop_c_f2py_signature.f: 


      subroutine advance(u, u_n, u_nm1, f, Cx2, Cy2, dt2, Nx, Ny) 
Cf2py intent(c) advance 
      integer Nx, Ny, N 
      real*8 u(0:Nx,0:Ny), u_n(0:Nx,0:Ny), u_nm1(0:Nx,0:Ny) 
      real*8 f(0:Nx,0:Ny), Cx2, Cy2, dt2 
Cf2py intent(in, out) u 
Cf2py intent(c) u, u_n, u_nm1, f, Cx2, Cy2, dt2, Nx, Ny 
      return 
      end 




Note that we need a special f2py instruction, through a Cf2py comment line, to 
specify that all the function arguments are C variables. We also need to tell f2py 
that the function is actually written in C: intent(c) advance. 

Since f2py is just concerned with the function signature and not the complete 
contents of the function body, it can easily generate the Fortran 90 module 
specification based solely on the signature above: 


Terminal 


Terminal> f2py -m wave2D_u0_loop_c_f2py \ 
-h wave2D_u0_loop_c_f2py.pyf --overwrite-signature \ 
wave2D_u0_loop_c_f2py_signature.f 


The compile and build step is as for the Fortran code, except that we list C files 
instead of Fortran files: 


Terminal 


Terminal> f2py -c wave2D_u0_loop_c_f2py.pyf \ 
--build-dir tmp_build_c \ 
-DF2PY_REPORT_ON_ARRAY_COPY=1 wave2D_u0_loop_c.c 


As when interfacing Fortran code with f2py, we need to print out the doc string to 
see the exact call syntax from the Python side. This doc string is identical for the C 
and Fortran versions of advance. 


C.8.1 Migrating Loops to C++ via f2py 


C++ is a much more versatile language than C or Fortran and has over the last 
two decades become very popular for numerical computing. Many will therefore 
prefer to migrate compute-intensive Python code to C++. This is, in principle, easy: 
just write the desired C++ code and use some tool for interfacing it from Python. 
A tool like SWIG³ can interpret the C++ code and generate interfaces for a wide 
range of languages, including Python, Perl, Ruby, and Java. However, SWIG is a 
comprehensive tool with a correspondingly steep learning curve. Alternative tools, 
such as Boost Python⁴, SIP⁵, and Shiboken⁶ are similarly comprehensive. Simpler 
tools include PyBindGen⁷. 

A technically much easier way of interfacing C++ code is to drop the possibility 
to use C++ classes directly from Python, but instead make a C interface to the C++ 
code. The C interface can be handled by f2py as shown in the example with pure C 
code. Such a solution means that classes in Python and C++ cannot be mixed and 
that only primitive data types like numbers, strings, and arrays can be transferred 
between Python and C++. Actually, this is often a very good solution because it 
forces the C++ code to work on array data, which usually gives faster code than if 
fancy data structures with classes are used. The arrays coming from Python, and 
looking like plain C/C++ arrays, can be efficiently wrapped in more user-friendly 
C++ array classes in the C++ code, if desired. 


3 http://swig.org/ 
4 http://www.boost.org/doc/libs/1_51_0/libs/python/doc/index.html 
5 http://riverbankcomputing.co.uk/software/sip/intro 
6 http://qt-project.org/wiki/Category:LanguageBindings::PySide::Shiboken 
7 http://code.google.com/p/pybindgen/ 


C.9 Exercises 


Exercise C.1: Explore computational efficiency of numpy.sum versus built-in sum 

Using the task of computing the sum of the first n integers, we want to compare 
the efficiency of numpy.sum versus Python's built-in function sum. Use IPython's 
%timeit functionality to time these two functions applied to three different 
arguments: range(n), xrange(n), and arange(n). 

Filename: sumn. 


Exercise C.2: Make an improved numpy.savez function 

The numpy.savez function can save multiple arrays to a zip archive. Unfortunately, 
if we want to use savez in time-dependent problems and call it multiple times (once 
per time level), each call leads to a separate zip archive. It is more convenient to 
have all arrays in one archive, which can be read by numpy.load. Section C.2 
provides a recipe for merging all the individual zip archives into one archive. An 
alternative is to write a new savez function that allows multiple calls and storage 
into the same archive prior to a final close method to close the archive and make it 
ready for reading. Implement such an improved savez function as a class Savez. 

The class should pass the following unit test: 


import numpy as np 

def test_Savez(): 
    import tempfile, os 
    tmp = 'tmp_testarchive' 
    database = Savez(tmp) 
    for i in range(4): 
        array = np.linspace(0, 5+i, 3) 
        kwargs = {'myarray_%02d' % i: array} 
        database.savez(**kwargs) 
    database.close() 

    database = np.load(tmp+'.npz') 

    expected = { 
        'myarray_00': np.array([0., 2.5, 5.]), 
        'myarray_01': np.array([0., 3., 6.]), 
        'myarray_02': np.array([0., 3.5, 7.]), 
        'myarray_03': np.array([0., 4., 8.]), 
        } 
    for name in database: 
        computed = database[name] 
        diff = np.abs(expected[name] - computed).max() 
        assert diff < 1E-13 
    database.close() 
    os.remove(tmp+'.npz') 


Hint Study the source code⁸ for function savez (or more precisely, function 
_savez). 
Filename: Savez. 


Exercise C.3: Visualize the impact of the Courant number 

Use the pulse function in wave1D_dn_vc.py to simulate a pulse through two 
media with different wave velocities. The aim is to visualize the impact of the 
Courant number C on the quality of the solution. Set slowness_factor=4 and 
Nx=100. 

Simulate for C = 1, 0.9, 0.75 and make an animation comparing the three curves 
(use the animate_archives.py program to combine the curves and make 
animations on the screen and video files). Perform the investigations for different 
types of initial profiles: a Gaussian pulse, a “cosine hat” pulse, half a “cosine hat” 
pulse, and a plug pulse. 

Filename: pulse1D_Courant. 


Exercise C.4: Visualize the impact of the resolution 

We solve the same set of problems as in Exercise C.3, except that we now fix C = 1 
and instead study the impact of Δt and Δx by varying the Nx parameter: 20, 40, 
160. Make animations comparing three such curves. 

Filename: pulse1D_Nx. 


8 https://github.com/numpy/numpy/blob/master/numpy/lib/npyio.py 